OpenRefine on Hadoop, Lead Developer
SpazioDati - Trento
We are looking for a reliable, passionate and creative Java and Hadoop developer to help us bring OpenRefine to the next level.
Our task, within a funded 2-year innovation project run in collaboration with top business and research institutions worldwide, is to develop a massively scalable, parallel, batch execution engine for OpenRefine based on Apache Hadoop (and similar technologies). The engine will be released as Open Source and will become part of our own production data ingestion, enrichment & curation pipeline, which powers our data marketplace dandelion.eu.
Reporting to the Director of Engineering, you will work with our passionate and extremely intelligent Engineering Team to:
- develop an integrated, scalable, reliable and high-performance system
- lead the interaction with the other project partners and with the OpenRefine community
Since the result of this project will be a crucial component of our big data pipeline, this position offers you the perfect opportunity to grow and advance your career as we continue to build a great company.
A successful candidate will have:
- Proven ability to handle large volumes of data efficiently
- Proven experience in developing applications on Apache Hadoop or similar Open Source big data technologies (e.g. Storm)
- Outstanding Java programming skills
- Experience with Hadoop, HDFS data design for performance and scale, RDF, SPARQL, and other Linked Data technologies
- Rock-solid Computer Science background
- Strong willingness to learn and share knowledge with the rest of the team
The following are a plus:
- Working knowledge of a scripting language (preferably Python)
- Past contributions to Open Source Software projects
- Experience developing in environments using Agile / CI / Git
- MSc or PhD in a technology- or science-related discipline
- Background in Machine Learning & AI
- Basic understanding of the Italian language, or willingness to learn it
We practice the Agile/Scrum methodology, and most of our software is developed in Python and Java. Our infrastructure is based on AWS, and our stack includes software such as Ubuntu, Nginx, Hadoop, Titan Graph DB, Cassandra, OpenLink Virtuoso and OpenRefine. Our software development is supported by tools such as Vim, Sentry and Jenkins.
We offer an informal working environment and flexible working hours. The office is in Trento, but in exceptional cases we may consider remote work. The salary is negotiable and we offer benefits and incentives.
Agency resumes are not accepted and will be considered unsolicited resumes that are not subject to placement fees.
ABOUT SPAZIODATI
We are the first big data startup in Italy. We have very ambitious objectives and a visionary strategy based on the know-how of an international research group. Our management team has a track record of successful experiences in web companies and startups. We are backed by solid investors. We work hard and have a lot of fun!
HOW TO APPLY
Send us your CV (no .doc files, please!) and a short, informal introduction about yourself.
Please don’t forget to include links to projects you contributed to or some code we can look at.