Pinned Tweet
Our new work on measuring phonotactic complexity and its trade-offs has just been published in TACL. This is joint work with @BrianRhoArc and @ryandcotterell. https://www.mitpressjournals.org/doi/full/10.1162/tacl_a_00296 … Wanna know more about it? Follow the thread:
Finally, we tried including phoneme features explicitly in the LM embedding representations, but this yielded little benefit except in an extremely low-resource condition.
This correlation persists when controlling for characteristics of long words. Our proposed measure also correlates with word length both across and within language families. Standard measures of phonotactic complexity do not show such a correlation within language families.
Why should I care? Only weak trade-offs have been found between other measures of phonetic complexity (e.g. the size of the vowel inventory) and word length. Entropy (bits per phoneme) shows a strong negative correlation with average word length (-0.74). pic.twitter.com/mNmJOmUpAA
(Pros, continued): - it requires relatively modest annotation to compute; - it permits a straightforward cross-linguistic comparison; - it uses a very cool LSTM (actually a standard one, but they are cool).
Pros of our approach: - it considers the frequency distribution of the phonemes in a language; - it covers corner cases (e.g. borrowed words and phonemes) at the rate they appear; - it captures long-distance dependencies within words (e.g. vowel harmony);
We do an extensive review of relevant literature: - what is phonotactics and how has it been modeled? - what is complexity and how might one measure it? - the role of information theory in linguistic complexity.
What are phonotactics? The study of which sequences of segments constitute natural-sounding words. The classic example (due to Chomsky and Halle) is: - brick: actual English word; - blick: not a word, but judged to be grammatical by speakers; - bnick: not a possible word.
In this paper, we propose using bits per phoneme as a measure of phonotactic complexity. We train a phoneme-level LSTM language model and measure its cross-entropy on held-out data. This gives us an upper bound on the actual "phonotactic entropy" of a language.
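To make "bits per phoneme" concrete, here is a minimal sketch of measuring held-out cross-entropy. It is not the paper's setup: an add-one-smoothed bigram model stands in for the LSTM, the toy words are invented for illustration, and counting an end-of-word symbol as a predicted event is an assumption. The function names (`train_bigram`, `bits_per_phoneme`) are hypothetical.

```python
import math
from collections import Counter

BOW = "<w>"   # beginning-of-word context (assumed convention)
EOW = "</w>"  # end-of-word symbol, counted as a predicted event (assumption)


def train_bigram(words, alpha=1.0):
    """Fit an add-alpha smoothed bigram model over phoneme sequences.

    A stand-in for the phoneme-level LSTM; `words` is a list of phoneme lists.
    Returns (prob, vocab), where prob(prev, cur) = p(cur | prev).
    """
    vocab = {EOW}
    pair_counts = Counter()
    context_counts = Counter()
    for word in words:
        vocab.update(word)
        prev = BOW
        for ph in list(word) + [EOW]:
            pair_counts[(prev, ph)] += 1
            context_counts[prev] += 1
            prev = ph
    size = len(vocab)

    def prob(prev, cur):
        return (pair_counts[(prev, cur)] + alpha) / (context_counts[prev] + alpha * size)

    return prob, vocab


def bits_per_phoneme(prob, words):
    """Average negative log2-probability per predicted symbol on held-out words.

    Assumes every held-out phoneme was seen in training (no unknown-symbol handling).
    """
    total_bits, n_symbols = 0.0, 0
    for word in words:
        prev = BOW
        for ph in list(word) + [EOW]:
            total_bits -= math.log2(prob(prev, ph))
            n_symbols += 1
            prev = ph
    return total_bits / n_symbols


# Toy data, invented for illustration only.
train_words = [list("brik"), list("blik"), list("stik")]
heldout_words = [list("brit")]
prob, vocab = train_bigram(train_words)
print(f"{bits_per_phoneme(prob, heldout_words):.3f} bits per phoneme")
```

Because cross-entropy under any model is at least the true entropy in expectation, a better model (such as the LSTM in the thread) can only tighten this upper bound.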
Tiago Pimentel retweeted
Please join SIGTYP, ACL's newest SIG! We are dedicated to the computational study of linguistic typology and multilingual NLP. Sign up on our website https://sigtyp.github.io/ . We are hosting a workshop (SIGTYP 2020) at @emnlp2020. Consider submitting (https://sigtyp.github.io/ws2020.html )!
Tiago Pimentel retweeted
Tiago Pimentel presenting a paper nominated for best paper at @ACL2019_Italy, on work with @aryamccarthy and others. #acl2019nlp pic.twitter.com/llRXl28Le6
Our work is a candidate for best paper at @ACL2019_Italy :) Read a short description in the thread! The whole paper is on arXiv: https://arxiv.org/abs/1906.05906 ! And come to our presentation on Monday afternoon! P.S.: Also read the other candidates! The list is here: http://www.acl2019.org/EN/nominations-for-acl-2019-best-paper-awards.xhtml … https://twitter.com/tpimentelms/status/1140728629862838273 …
Drawbacks: - We used English word2vec as meaning vectors for all languages in NorthEuraLex -> definitely not #BenderRule. - Using word2vec vectors might not fully capture meaning, so we might be underestimating systematicity. - Our mutual information measurements are estimates.
Advantages over previous work: - Easy to control for trivial factors of systematicity. - Can capture non-linear relationships between meaning and form. - Bits per phone is a more tangible measurement unit than Pearson correlation results.
Results 3: Finding fantastic phonesthemes. We can use our technique to get a list of phonesthemes in English, German and Dutch. pic.twitter.com/keRc6slDpY
Results 2: Using NorthEuraLex, we find systematicity in 87 (out of 106) analysed languages. After controlling for POS tags, though, we only find it for 17. pic.twitter.com/SNEAUHWQ0M
Results 1: Using CELEX, we find systematicity (statistically significant nonzero mutual information I(V; W)) in English, German and Dutch. We also find systematicity after controlling for POS tags. pic.twitter.com/6QiZlBm9aW
How do we represent forms? We use words' phone string representation in the International Phonetic Alphabet (IPA). How do we represent meaning? We use word2vec distributed representations.
By framing systematicity as mutual information, we can straightforwardly control for known (maybe trivial) factors of systematicity (e.g. grammatical class in Portuguese / Spanish). I(W; V | G) = H(W | G) - H(W | V, G)
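A short derivation of why conditioning on G "controls" for it, using only the standard definitions of conditional entropy and conditional mutual information (logs are taken base 2 so the unit is bits; the summation notation is an addition for illustration):

```latex
\begin{align*}
I(W; V \mid G)
  &= \sum_{g} p(g) \sum_{w,v} p(w, v \mid g)\,
     \log_2 \frac{p(w, v \mid g)}{p(w \mid g)\, p(v \mid g)} \\
  &= \sum_{g,w,v} p(g, w, v)\, \log_2 \frac{p(w \mid v, g)}{p(w \mid g)} \\
  &= H(W \mid G) - H(W \mid V, G).
\end{align*}
```

If form W depends on meaning V only through G (that is, W and V are conditionally independent given G), then H(W | V, G) = H(W | G) and the quantity is exactly zero, so any remaining mutual information reflects systematicity beyond what G explains.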
We estimate mutual information as the difference in entropy of two phone-level (form; W) LSTM language models, one of which is conditioned on the semantic representation (meaning; V). I(W; V) = H(W) - H(W | V) pic.twitter.com/C19pNG6pWo
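The identity I(W; V) = H(W) - H(W | V) can be illustrated on discrete toy data. This sketch is not the thread's method: it swaps the two LSTM cross-entropies for simple plug-in (count-based) entropy estimates, treats both form and meaning as discrete labels, and uses invented pairs. All function names are hypothetical.

```python
import math
from collections import Counter


def entropy_bits(samples):
    """Plug-in estimate of H(X) in bits from a list of discrete samples."""
    counts = Counter(samples)
    n = len(samples)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def conditional_entropy_bits(pairs):
    """Plug-in estimate of H(W | V) from (v, w) pairs.

    Computed as the weighted average of H(W | V = v) over observed values v.
    """
    forms_by_meaning = {}
    for v, w in pairs:
        forms_by_meaning.setdefault(v, []).append(w)
    n = len(pairs)
    return sum(len(ws) / n * entropy_bits(ws) for ws in forms_by_meaning.values())


def mutual_information_bits(pairs):
    """I(W; V) = H(W) - H(W | V), the difference of the two entropy estimates."""
    h_w = entropy_bits([w for _, w in pairs])
    return h_w - conditional_entropy_bits(pairs)


# Invented toy data: (meaning label, word-initial cluster).
systematic = [("light", "gl"), ("light", "gl"), ("slippery", "sl"), ("slippery", "sl")]
arbitrary = [("light", "gl"), ("light", "sl"), ("slippery", "gl"), ("slippery", "sl")]

print(mutual_information_bits(systematic))  # 1.0: meaning fully determines the form
print(mutual_information_bits(arbitrary))   # 0.0: knowing the meaning does not help
```

Plug-in estimates like these are biased on small samples, which is one reason to estimate both entropies with held-out cross-entropy from trained models instead, as the thread describes.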