User-Focused Marian Code

The User-Focused Marian project, funded by the Connecting Europe Facility, added the following features to Marian:
1. Improved documentation (see doc/ and
2. Better factors.
3. Using factors for terminology.
4. Run-time domain adaptation.
5. 8-bit GPU support.

As of January 31, 2022, features 1-3 are merged upstream.

Run-time domain adaptation is in the master branch of this repository, which also includes recent upstream master, so that branch has features 1-4.

The 8-bit GPU implementation is in the 8bitgpu branch. 8-bit support requires quantizing tensors (which has some overhead) followed by faster matrix multiplication. This is faster than the implementation in Marian that existed at the start of the project. However, NVIDIA then contributed FP16 improvements, which project partners also helped integrate. With these improvements, FP16 became faster than the 8-bit version, even though we consulted with NVIDIA on the 8-bit version (including tracking down the engineer who wrote GEMM for the A100). Therefore the 8-bit branch was not merged and remains an isolated branch: git checkout 8bitgpu. That branch contains instructions on usage.
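
For readers unfamiliar with the quantize-then-multiply idea described above, here is a minimal sketch in plain Python. It is not the code from the 8bitgpu branch: the function names are hypothetical, and symmetric per-tensor scaling is an assumption made for illustration. It only shows where the quantization overhead comes from and how the integer product is scaled back to floats.

```python
# Illustrative sketch only; names are hypothetical and not part of Marian.
# Assumption: symmetric per-tensor quantization to signed 8-bit integers.

def quantize(matrix, bits=8):
    """Map floats to integers in [-qmax, qmax]; return (ints, scale)."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(x) for row in matrix for x in row)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    quantized = [
        [max(-qmax, min(qmax, round(x / scale))) for x in row]
        for row in matrix
    ]
    return quantized, scale


def int_matmul(a, b):
    """Multiply integer matrices; Python ints stand in for wide accumulators."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def quantized_matmul(a, b):
    """Quantize both inputs (the overhead), multiply as integers, rescale."""
    qa, scale_a = quantize(a)
    qb, scale_b = quantize(b)
    acc = int_matmul(qa, qb)
    return [[v * scale_a * scale_b for v in row] for row in acc]


def float_matmul(a, b):
    """Reference floating-point product for comparison."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


if __name__ == "__main__":
    a = [[0.5, -1.0], [0.25, 2.0]]
    b = [[1.0, 0.0], [-0.5, 0.75]]
    print(quantized_matmul(a, b))  # approximate result
    print(float_matmul(a, b))      # exact reference
```

The two quantize calls are the extra pass over the data that the paragraph above calls overhead. The integer multiply is the step that low-precision hardware can run faster than a floating-point one.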