We decided to focus on artificially constructed minimal pair datasets: pairs of sentences that differ by a single edit, where that edit causes one of the sentences to be ungrammatical for a clearly identifiable reason. pic.twitter.com/KFYQ3HR5Cx
If an LM reliably assigns higher likelihood to the better sentence, that suggests that it recognizes the underlying grammatical distinction.
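The comparison described above can be sketched in a few lines. This is only an illustration, not the BLiMP evaluation code: a toy add-one-smoothed bigram model stands in for the LM, and the corpus, sentence pairs, and function names (`train_bigram`, `log_likelihood`, `minimal_pair_accuracy`) are all hypothetical.

```python
import math
from collections import Counter

def train_bigram(corpus):
    """Count unigrams and bigrams over whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def log_likelihood(sentence, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of a sentence."""
    tokens = ["<s>"] + sentence.lower().split()
    vocab_size = len(unigrams) + 1  # +1 reserves mass for unseen words
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return total

def minimal_pair_accuracy(pairs, score):
    """Fraction of (good, bad) pairs where the good sentence scores strictly higher."""
    correct = sum(score(good) > score(bad) for good, bad in pairs)
    return correct / len(pairs)

# Toy training data (assumption: stands in for a real LM's training corpus).
corpus = ["the cat sleeps", "the cats sleep", "the dog barks", "the dogs bark"]
unigrams, bigrams = train_bigram(corpus)

# Each pair differs by a single edit that breaks subject-verb agreement.
pairs = [
    ("the cat sleeps", "the cat sleep"),
    ("the dogs bark", "the dogs barks"),
]
accuracy = minimal_pair_accuracy(pairs, lambda s: log_likelihood(s, unigrams, bigrams))
print(f"accuracy: {accuracy:.2f}")
```

With a real LM, only the scoring function would change: swap in the model's total log-probability for each sentence and keep the pairwise comparison as is.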
This kind of data is already established in the literature. Our contribution is to put together a large collection of these minimal pair datasets that offers a broad survey of English syntax.
Our new benchmark, BLiMP, contains 67 of these minimal pair datasets, and we try to include at least one for nearly all of the phenomena that would warrant a section in a typical English syntax textbook.
We try to use comparable methods and comparable lexical content (i.e., words) across all 67, but we see striking differences. Big Transformer models are at human level on some phenomena, and closer to chance on others. pic.twitter.com/1D6lNYR1sa
If you're curious about which phenomena tend to emerge earlier in the course of NN learning, we have some preliminary results on that too for a mid-size LSTM! pic.twitter.com/MmxBd2MkJK
This emerged from a collaboration over the summer led by @a_stadt and @AliciaVParrish, with @liu_haokun, @anhadmy, Wei Peng, and Sheng-Fu Wang.
Try the data out, or take a look at our draft paper, here: https://github.com/alexwarstadt/blimp …
This is a snapshot of ongoing work—comments welcome! Also, if you're coming to the LSA, stop by our talk on this work at SCiL!
This looks like a great dataset to inform adversarial approaches