Marcin Junczys-Dowmunt (Marian NMT)

@marian_nmt

NLP. NMT. Main author of Marian NMT. Research Scientist at Microsoft Translator. Non-NLP stuff on

Joined April 2017.

Tweets

  1. Pinned tweet

    Marian 1.7.0 released - now with built-in SentencePiece model training. Say goodbye to preprocessing! And here's a Marian/SentencePiece/SacreBLEU tutorial (thanks to and for their great tools):

  2. And somehow I managed not to have heard about the back story before.

  3. This paper from the proceedings is fascinating: "Human Evaluation of Neural Machine Translation: The Case of Deep Learning". Here, "Deep Learning" is the title of a book that was machine-translated and then post-edited by humans. The paper describes the effort.

  4. 2) This is my first time blogging. It's way too wordy, took way too much time, and got partially crowd-sourced in Twitter comments on the first version (thanks ). Still a load of fun. Now this needs to sit for a while and might eventually grow up to become a paper.

  5. 1) Oh my. New and improved version online. Whole large new section on measuring lexical diversity for real MT systems (as requested ). It attempts to get it right, succeeds only halfway (if at all), and still comes up with similar results as before.

  6. Retweeted
    Jan 24

    Dia daoibh! Tá Gaeilge againn! Microsoft Translator is pleased to add Irish to its family of languages. Look for Irish in Microsoft products as a translation language, in the Azure Cognitive Services Translator API, and in the Microsoft Translator app.

  7. Corrections corner: I wrote that Vanmassenhove et al. (, ) used 100K training sentences. That's wrong, they used 1M sentences. Apologies for that. Unfortunately, I saw 's tweet about that only now. It's corrected in the post as well.

  8. Comments via GitHub under the blog post are now enabled. If you have beef with the conclusions/results that warrants a longer response, let's move it under the post.

  9. Also repo for full reproducibility: Just run make -j

  10. So, a blog post. Not connected to Marian at all, surprisingly, but about an MT quality rabbit hole I went down over the last couple of days. "Is MT really lexically less diverse than human translation?" If that's useful in any way I might keep on blogging.

  11. Wow. That's impressive! Now I am afraid of the coming GitHub issues :)

  12. Siileeeence! Siiiileeeence! The Fridays have it. The Fridays have it. Looks like the official part of the MTMA 2020 is going to take place from Tuesday May 26th to Friday May 29th. We might be tempted to organize something social on Saturday. Thanks to everyone who voted!

  13. 1 hour left. At the moment the Fridays are leading with a tiny majority. But that's how elections are won these days.

  14. 6 hours to go. I think y'all are just trolling me, tied again.

  15. And last self-retweet before the poll ends. Seems to me America is divided on this important question.

  16. Daily poll retweet, you are not making this any easier with that tie

  17. This poll is also our first, for now unofficial, announcement. Planning is starting now :) Redmond has a pretty good bus connection with Seattle (my daily commute), so that's also a good occasion to get to know the city before ACL 2020 in July.

  18. The Machine Translation Marathon of the Americas 2020 is going to take place at Microsoft HQ in Redmond, WA. We start on Tuesday, May 26th, a day after Memorial Day. But when should we stop? Do you want to finish after 4 days on Friday or after 5 days of MT goodness on Saturday?

  19. Retweeted
    Jan 16

    NEW! Live Captions allows you to bring in real-time captions & translations from Microsoft Translator 😱 🌎 Captions in 63 languages ⏸ Pause/Play captions 🖊 Highlight 📝 Auto-save transcript 👉

  20. Retweeted
    Jan 16

    Super nice feature for note-taking in class. In the teacher's language or your own. Builds on the great speech transcription and translation capabilities of

  21. argh, "that"* and "go on an ..."*.

