Is there a way to identify common combinations of words that occur in titles, maybe with a whitelist? If “John Smith” is a common combination, then tag “John Smith” instead of, or along with, “John” and “Smith”.
At the moment, this isn’t something Title to Tags can do. Really, it’s just checking each word against a list of stop words to see if it should _not_ be included in the tags that get created. Teaching it to look out for specific phrases would be an entirely different exercise.
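To make that concrete, here is a rough sketch of the kind of stop-word check described above. It is written in Python purely for illustration and is not the plugin's actual code; the stop-word list, the tokenizing pattern, and the function name `title_to_tags` are all assumptions.

```python
import re

# Hypothetical stop-word list; the plugin's real list is not shown in this thread.
STOP_WORDS = {"a", "an", "the", "of", "in", "at", "to", "and"}

def title_to_tags(title, stop_words=STOP_WORDS):
    """Return the words of a title that are not stop words, in order, without repeats."""
    # Assumed tokenization: runs of letters, digits, and apostrophes.
    words = re.findall(r"[A-Za-z0-9']+", title)
    tags = []
    seen = set()
    for word in words:
        key = word.lower()
        # Skip stop words and words already turned into a tag.
        if key in stop_words or key in seen:
            continue
        seen.add(key)
        tags.append(word)
    return tags

print(title_to_tags("John Smith Visits the Grand Canyon"))
# ['John', 'Smith', 'Visits', 'Grand', 'Canyon']
```

Every word is judged on its own, which is why “John” and “Smith” come out as two separate tags.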
The problem with phrases is simply that there’s no logical way of knowing, from a string of characters, what is a significant “phrase” vs just a word in the series. I’ve had it in mind that I could incorporate Open Calais to look for newsworthy or generally significant phrases. But something like “John Smith,” unless you’re speaking about a famous John Smith, wouldn’t be caught either.
Hope this helps. I would be glad to hear back from you about what your specific needs are.
Thanks for the quick reply. It would be for news titles, so yes, “famous” names and places (Grand Canyon, New York, for example). That is why I thought of a whitelist: we could supply the places, names, etc.
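The whitelist idea could look something like the sketch below. It is in Python for illustration only and is not an existing Title to Tags feature; the phrase list, the stop-word list, and the function name `tags_with_phrases` are hypothetical. It scans the title left to right, tries the longest whitelisted phrase first at each position, and falls back to the single word (unless it is a stop word) when no phrase matches.

```python
import re

# Hypothetical, user-supplied lists; neither comes from the plugin.
PHRASE_WHITELIST = ["John Smith", "Grand Canyon", "New York"]
STOP_WORDS = {"a", "an", "the", "of", "in", "at", "to", "and"}

def tags_with_phrases(title, phrases=PHRASE_WHITELIST, stop_words=STOP_WORDS):
    """Tag whitelisted phrases as single tags; tag remaining non-stop words individually."""
    words = re.findall(r"[A-Za-z0-9']+", title)
    lowered = [w.lower() for w in words]
    # Map each phrase's lowercase word tuple to its canonical spelling.
    lookup = {tuple(p.lower().split()): p for p in phrases}
    max_len = max((len(key) for key in lookup), default=1)

    tags = []
    i = 0
    while i < len(words):
        matched = False
        # Try the longest possible phrase first, then shorter ones.
        for n in range(min(max_len, len(words) - i), 0, -1):
            key = tuple(lowered[i:i + n])
            if key in lookup:
                tags.append(lookup[key])
                i += n
                matched = True
                break
        if not matched:
            # No phrase starts here: keep the word unless it is a stop word.
            if lowered[i] not in stop_words:
                tags.append(words[i])
            i += 1
    return tags

print(tags_with_phrases("John Smith Visits the Grand Canyon in New York"))
# ['John Smith', 'Visits', 'Grand Canyon', 'New York']
```

This version tags the phrase instead of its individual words. Tagging “John” and “Smith” along with “John Smith” would mean also appending the separate words whenever a phrase matches.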