I asked Windows Cortana three questions that I felt should be easily answerable by any decent AI. It didn't get one right.
1) What was the date 100,000 days ago?
2) How old was Paul McCartney when John Lennon was killed?
3) How many days ago was the first moon landing?
The first question is fairly simple: what is today's date minus 100,000 days?
The second question is a little more complex. Cortana would need to connect the date of one event with a person's age at the time of that event.
The last question seems fairly easy.
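For reference, here is a rough sketch of the date arithmetic behind the three questions, using only Python's standard `datetime` module. The three dates are filled in from general knowledge rather than from this thread (McCartney born 18 June 1942, Lennon killed 8 December 1980, Apollo 11 landing 20 July 1969), and the function names are made up for illustration. Note that `datetime.date` uses the proleptic Gregorian calendar, so the answer to question 1 may differ from the date people were actually using in places that had not yet adopted the Gregorian calendar.

```python
from datetime import date, timedelta

# Assumed reference dates (supplied here, not stated in the thread).
PAUL_BORN = date(1942, 6, 18)       # Paul McCartney's birth date
LENNON_KILLED = date(1980, 12, 8)   # date John Lennon was killed
MOON_LANDING = date(1969, 7, 20)    # Apollo 11 landing


def date_days_ago(days, today):
    """Question 1: the calendar date `days` days before `today`."""
    return today - timedelta(days=days)


def age_on(born, when):
    """Question 2: age in whole years on the date `when`."""
    years = when.year - born.year
    # Subtract one if the birthday had not yet occurred that year.
    if (when.month, when.day) < (born.month, born.day):
        years -= 1
    return years


def days_since(event, today):
    """Question 3: number of days between `event` and `today`."""
    return (today - event).days


if __name__ == "__main__":
    today = date.today()
    print("1)", date_days_ago(100_000, today).isoformat())
    print("2)", age_on(PAUL_BORN, LENNON_KILLED))  # 38
    print("3)", days_since(MOON_LANDING, today))
```

The arithmetic itself is only a few lines. The hard part for an assistant is mapping the sentence onto the right dates and the right operation.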
I'm actually pretty shocked at how poorly current AI understands language and context. I expected an assistant like Cortana to handle both, but it feels like it is really just querying a large database of previous answers.
I asked my Google Home. For #2, it said Paul was 38.
For #3, it said 18345 days (I think).
I didn't ask #1.
NLP is hard! The gold standard for testing against human question-and-answer tasks is #SQuAD. There is a really nice explorer of the questions (and the range of human responses) here:
rajpurkar.github.io/SQuAD-explorer/


