Use case: Global Cargo Care
Challenge
Global Cargo Care receives large volumes of emails every day from incoming and outgoing ships. These emails state each ship's departure port, arrival port, arrival time, cargo, and weight. Senders write them as free text without a fixed format, and the messages often contain abbreviations and typographical errors.
Currently, an entire department is needed to parse these emails and manually enter the important fields into the database. Apart from being tedious, this work is costly and limits the scalability of the system. Concentration also drops over time with such repetitive work, which leads to inaccurate data being entered. In practice, these errors can have serious consequences.
The Datacation team was asked to design and develop a Natural Language Processing (NLP) module to automate this process.
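To make the extraction target concrete, the fields named above can be pictured as one structured record per email. The sketch below is illustrative only: the class name `ShipmentRecord`, its field names, and the sample message are assumptions made for this example, not Global Cargo Care's actual schema or data.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class ShipmentRecord:
    """Hypothetical target record for one parsed email (names are assumptions)."""

    departure_port: Optional[str] = None
    arrival_port: Optional[str] = None
    arrival_time: Optional[datetime] = None
    cargo: Optional[str] = None
    weight: Optional[float] = None  # unit is not specified in the source text

    def missing_fields(self) -> List[str]:
        # Fields that could not be extracted and may need manual review.
        return [name for name, value in vars(self).items() if value is None]


# An invented free-text message with abbreviations, as described above.
email_body = "mv aurora dep rtm 12/03, eta hamburg 14 mar 0600, crg steel coils abt 4200"

# What a filled-in record could look like for that message. The arrival time
# is left empty here because the message does not state a year.
record = ShipmentRecord(
    departure_port="Rotterdam",
    arrival_port="Hamburg",
    cargo="steel coils",
    weight=4200.0,
)

print(record.missing_fields())
```

Keeping every field optional lets an incomplete extraction be stored and flagged for review instead of being rejected outright.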
Process
The total solution encompasses aspects of data engineering and data science, making it an ideal project for Datacation.
We started with the data engineering part. The work done in this phase determines how data flows, which transformations are performed, and where the data is subsequently stored. The process needs to handle a large volume of emails, so robustness was crucial.
Once these data flows were well organized, we started developing the algorithm. We made the model self-learning: it periodically extracts data from the database and retrains itself. Additionally, we developed a dashboard that displays the performance of the model and allows retraining to be triggered manually.
After developing the algorithm, Datacation also took care of the integration phase, in which the functionality of the algorithm is integrated into Global Cargo Care's operations so that it can actually be used.
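The periodic retraining described above can be sketched as a small scheduling check. This is a minimal illustration under stated assumptions, not the production implementation: the function `retrain_if_due`, the weekly interval, and the stub `fetch` and `train` callables are all hypothetical stand-ins for the real database query and training routine.

```python
from datetime import datetime, timedelta
from typing import Callable, Dict, List, Tuple

# One labelled example: the raw email text plus its verified field values.
LabelledEmail = Tuple[str, Dict[str, str]]


def retrain_if_due(
    last_trained: datetime,
    now: datetime,
    interval: timedelta,
    fetch_examples: Callable[[], List[LabelledEmail]],
    train: Callable[[List[LabelledEmail]], None],
) -> datetime:
    """Retrain when the interval has elapsed; return the last training time."""
    if now - last_trained < interval:
        return last_trained  # not due yet
    examples = fetch_examples()  # stand-in for reading from the database
    if not examples:
        return last_trained  # nothing to learn from, keep the current model
    train(examples)
    return now


# Stubs used only to demonstrate the control flow.
trained_on: List[int] = []


def fetch() -> List[LabelledEmail]:
    return [("eta hamburg 14 mar", {"arrival_port": "Hamburg"})]


def train(examples: List[LabelledEmail]) -> None:
    trained_on.append(len(examples))


weekly = timedelta(days=7)  # assumed interval, not stated in the source
last = datetime(2024, 1, 1)

early = retrain_if_due(last, datetime(2024, 1, 3), weekly, fetch, train)
due = retrain_if_due(last, datetime(2024, 1, 9), weekly, fetch, train)
```

In this sketch, a manual retrain such as the one offered by the dashboard would amount to calling `train(fetch())` directly, bypassing the interval check.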
Solution
The model currently achieves an accuracy of 91.3%. As more data passes through the model over time, this accuracy can increase further. We are confident that this is already a big improvement over processing the emails manually.
Recent developments in the field of Natural Language Processing (NLP) make new architectures possible. Together with Global Cargo Care, we will keep exploring these possibilities to push the performance of the model to a new level.