In September 2014 the internet reached 1 billion websites, and it is still expanding rapidly. In the last second alone, the 3.6 billion people connected to the web posted 780 Instagram photos, sent 7,605 tweets and generated 44 TB of internet traffic.
To find the right information, that same group just queried Google 60,192 times, and that is where you come in.
Organizations are adding data to the web and to their internal systems as well, but a decent search across that data is still missing. It's up to you to build the ‘Library of Alexandria’ for organizations. A well-functioning search index for that library is nice; a machine learning tool that automatically finds the right information would be even better.
Goal: support the support employees by answering incoming emails and chats automagically. Ergo: automate all the things!
Nutch is a likely candidate for the crawler you have to build, and Elasticsearch is the likely search engine you can use. When it comes to the AI / machine learning part, things get tricky. Do you want to use the open source TensorFlow library by Google, or the closed source IBM Watson platform? Or are the machine learning cloud services from Google the right tool for the job? Amazon AWS might be an option as well, or maybe Databricks and Spark make the perfect combination.
As you can see, there are a lot of tools to help you out, but you will have to research which ones are best for the job. Build a scalable Library of Alexandria that lets organizations add data sources (even if they’re behind a login) and build a search for it (we have above average knowledge of search, so we can help you out here). Input can be either a short query or a longer text, such as an incoming email. Users can rate the output; the machine learning algorithm helps with the matching and learns from those ratings. The system can be the basis for a chatbot or can enhance an existing support system. Of course, all your work will be open source so the world can enjoy the fruits of your labor.
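To make the query, rating and learning loop described above a bit more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the project's design: the class `RatedSearchIndex` and its methods `add`, `search` and `rate` are hypothetical names, a plain TF-IDF score stands in for a real search engine, and a simple additive boost per (term, document) pair stands in for whatever machine learning model you end up choosing.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class RatedSearchIndex:
    """Toy in-memory index (hypothetical): TF-IDF ranking plus rating feedback."""

    def __init__(self, learning_rate=0.5):
        self.docs = {}             # doc_id -> Counter of term frequencies
        self.doc_freq = Counter()  # term -> number of documents containing it
        self.boost = {}            # (term, doc_id) -> adjustment learned from ratings
        self.learning_rate = learning_rate

    def add(self, doc_id, text):
        """Index a document, replacing any earlier version with the same id."""
        if doc_id in self.docs:
            self.doc_freq.subtract(self.docs[doc_id].keys())
        terms = Counter(tokenize(text))
        self.docs[doc_id] = terms
        self.doc_freq.update(terms.keys())

    def _score(self, query_terms, doc_id):
        terms = self.docs[doc_id]
        score = 0.0
        for term in query_terms:
            if term in terms:
                # Rare terms weigh more than terms found in many documents.
                idf = math.log(1 + len(self.docs) / self.doc_freq[term])
                score += terms[term] * idf
            # Feedback from earlier ratings, zero if there was none.
            score += self.boost.get((term, doc_id), 0.0)
        return score

    def search(self, query, limit=5):
        """Return up to `limit` (doc_id, score) pairs, best match first."""
        query_terms = tokenize(query)
        hits = []
        for doc_id in self.docs:
            score = self._score(query_terms, doc_id)
            if score > 0:
                hits.append((doc_id, score))
        # Sort by descending score; break ties by doc_id for stable output.
        hits.sort(key=lambda hit: (-hit[1], hit[0]))
        return hits[:limit]

    def rate(self, query, doc_id, rating):
        """Learn from a rating in [-1, 1] given to `doc_id` for `query`."""
        if doc_id not in self.docs:
            raise KeyError(doc_id)
        rating = max(-1.0, min(1.0, rating))
        for term in set(tokenize(query)):
            key = (term, doc_id)
            self.boost[key] = self.boost.get(key, 0.0) + self.learning_rate * rating


index = RatedSearchIndex()
index.add("reset", "How to reset your password from the login page")
index.add("billing", "Update billing details and download your invoice")
index.add("login", "Troubleshooting login problems and password errors after an update")

results = index.search("password reset")
index.rate("password reset", "login", 1.0)  # a support employee marks this answer as helpful
```

Because `search` only tokenizes its input, the same call works for a short query and for the full text of an incoming email. A real system would replace both the scoring and the feedback step with the tools you select during your research.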
You will have noticed by now that this is not your average graduation project / thesis. You need to bring your A-game, and you might even need to bring in a partner to do the research together.
Search, big data, machine learning, artificial intelligence, algorithm design
Besides compensation, you can also enjoy: