XData project: data integration on a Hadoop cluster

The XData project is a French collaborative project bringing together industry partners (startups as well as large companies) and academics. Its main objective is to develop innovative commercial products built by integrating private data with open data.

I mostly work on the XData “movement analytics” application, and more specifically on:

  1. The integration of movement data: any type of data that represents the movement of people, such as household or company relocations, as well as tourist travel. The integration is done in two main parts. First, a generalized data structure was defined, together with a generic data descriptor, so that any data set containing movement data can be imported. Second, an automated data query algorithm was defined to select the movement entries that match a requested geographical and temporal area and granularity.
  2. The transfer of the standalone prototype of the web application, which uses MySQL and Spring, to the Hadoop cluster of the XData project, in particular using Spark and Hive.
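
To make the first item more concrete, here is a minimal Python sketch of what a generic movement record, a data descriptor and a selection query could look like. It is only an illustration, not the XData implementation: the `Movement` fields, the `HOUSING_DESCRIPTOR` column names and the `import_rows` and `select_movements` functions are all hypothetical, and granularity handling is left out.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Movement:
    """Generic movement entry (hypothetical fields)."""
    origin: str        # name or code of the origin area
    destination: str   # name or code of the destination area
    start: date        # first day covered by the entry
    end: date          # last day covered by the entry
    count: int         # number of people, households, companies...


# Hypothetical descriptor: maps each generic field to the column
# that holds it in one particular source data set.
HOUSING_DESCRIPTOR = {
    "origin": "from_city",
    "destination": "to_city",
    "start": "period_start",
    "end": "period_end",
    "count": "households",
}


def import_rows(rows, descriptor):
    """Convert raw rows (dicts of strings) into Movement records."""
    return [
        Movement(
            origin=row[descriptor["origin"]],
            destination=row[descriptor["destination"]],
            start=date.fromisoformat(row[descriptor["start"]]),
            end=date.fromisoformat(row[descriptor["end"]]),
            count=int(row[descriptor["count"]]),
        )
        for row in rows
    ]


def select_movements(movements, areas, period_start, period_end):
    """Keep entries whose endpoints are both in `areas` and whose
    time span lies entirely inside the requested period."""
    return [
        m for m in movements
        if m.origin in areas
        and m.destination in areas
        and period_start <= m.start
        and m.end <= period_end
    ]
```

With this layout, importing a new data set only requires writing a new descriptor, while the selection code stays unchanged. For example, `select_movements(import_rows(rows, HOUSING_DESCRIPTOR), {"Lyon", "Paris"}, date(2012, 1, 1), date(2012, 12, 31))` would keep only the 2012 entries between those two cities.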