Does anyone have a Python function that extracts hashtags from text using the same rules Twitter applies? I've checked Stack Overflow but haven't found any up-to-date code.
import re

def extract_hashtags(text):
    '''Extract hashtags, skipping tags that are digits only.'''
    valid_tags = set()
    # \w+ matches Unicode word characters for str patterns in Python 3
    for tag in re.findall(r'#(\w+)', text):
        if not tag.isdigit():
            valid_tags.add(tag)
    return valid_tags
This ignores hashtags that are only digits. It seems hashtags can start with a number, but they can't consist of numbers alone. It's at least a starting point, and because \w is Unicode-aware in Python 3 it appears to handle characters from languages other than English.
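If it helps, here is a slightly stricter sketch of the same idea. It keeps the digits-only rule from above and adds one assumption of my own: a "#" that directly follows a word character or "&" (as in "email#tag") is not treated as the start of a hashtag. That boundary rule is a guess, not Twitter's actual grammar, and the name extract_hashtags_ordered is just made up for this example. It also returns a list so the tags stay in order of first appearance.

```python
import re

# Assumption (not Twitter's published rule): '#' only starts a hashtag
# when it is not directly preceded by a word character or '&'.
HASHTAG_RE = re.compile(r'(?<![\w&])#(\w+)')

def extract_hashtags_ordered(text):
    """Return unique hashtags in order of first appearance.

    Tags made up only of digits are skipped, as in the snippet above.
    """
    seen = []
    for tag in HASHTAG_RE.findall(text):
        # Skip digit-only tags and anything already collected
        if tag.isdigit() or tag in seen:
            continue
        seen.append(tag)
    return seen
```

Calling it on something like "Loving #Python3 and #日本語, not #123 or email#tag" should keep the first two tags and drop the rest. For real Twitter parity you'd probably still want to compare against their own open-source twitter-text library.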