Requirements
- 2+ years developing software using Java or Scala in a Linux environment
- 2+ years with Big Data technologies, cloud and/or on-prem – expertise in Hadoop, Spark, MapReduce, AWS
- Linux, shell scripting
- Good coding skills in at least one of: Java, Scala, or Python
- SQL skills
- Strong analytical and troubleshooting skills
- Bachelor's degree in Computer Science or a related field (nice to have)
- Professional working proficiency in English (both oral and written)
Offer description
💰 12 000 – 20 000 PLN net/month B2B
📠 B2B or employment contract
🌎 100% remote or Wroclaw office (you can mix)
💻 Comscore is a media measurement and analytics company providing marketing data and analytics to enterprises
📌 Work for customers such as Disney, Amazon and Facebook
Comscore is looking for a Data Engineer.
We are a leading cross-platform audience measurement company. We measure what matters to make audiences and advertising more valuable. We collect and process billions of events each day, keep tens of petabytes online, and each month our processes read nearly an exabyte. We use this capability to unlock insights in our clients' data. This job involves designing and maintaining big data web and engineering solutions that run efficiently and robustly at petabyte scale.
⚙️ We offer:
- Real big data projects – working with data for big players on the market
- International team (working with the best specialists around the globe)
- Hands-on involvement in all solutions and decisions within the team – your ideas can go to production
- Small, independent team working environment (no crazy meetings, no big management structure; the team makes the rules and ships the software)
- Working on a product (no project switching every 3 months) – we believe this approach benefits both sides the most: you can fully focus on a specific tech stack, a specific product, and specific problems
- Private healthcare, MultiKafeteria
- Flexible work time – if you need to get something done around town – no problem
- B2B/Employment contract/Contract of mandate – we're up for whatever type of agreement fits you best
- Fully remote or in the office (Wroclaw, Szewska 5) – you can choose, and mix if you want
Responsibilities
- Define data models that tie together large datasets from multiple sources (terabytes to hundreds of terabytes of data) – see the schema sketch after this list
- Design, implement, and maintain data pipelines and data-driven solutions (using Python or Java/Scala) in a Linux/AWS environment
- Build data pipelines using Apache Airflow, Spark, Hadoop, or whatever tool is appropriate (see the pipeline sketch after this list). We research some of these technologies ourselves, and we have our own processing engine that handles huge volumes for very specific processes that couldn't be done with existing solutions
- Design and implement solutions using various databases with very specific flows and requirements: Apache Druid, AWS S3, Cassandra, Elasticsearch, Hadoop/HDFS/HBase, and more
- Optimize – working with big data is very specific: depending on the process it can be IO-, CPU-, or network-bound, and we need to find faster ways of doing things (see the join sketch after this list). At least empirical knowledge of computational complexity helps, because in big data even simple operations become costly once multiplied by the size of the dataset
- Collaborate with DevOps and other teams to keep solutions and integrations running smoothly
- Coordinate with external stakeholders to deliver on product requirements
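To give a flavor of the data-modeling bullet above, here is a minimal sketch of a model tying two sources together in PySpark. Everything in it is hypothetical – the table names, columns, and S3 paths are invented for illustration, not taken from our actual systems:

```python
from pyspark.sql import SparkSession
from pyspark.sql.types import (
    StructType, StructField, StringType, LongType, TimestampType,
)

spark = SparkSession.builder.appName("data-model-sketch").getOrCreate()

# Hypothetical fact table: one row per measured viewing event.
viewing_events_schema = StructType([
    StructField("event_id", StringType(), nullable=False),
    StructField("device_id", StringType(), nullable=False),
    StructField("content_id", StringType(), nullable=False),
    StructField("event_time", TimestampType(), nullable=False),
    StructField("duration_ms", LongType(), nullable=True),
])

# Hypothetical dimension table: device metadata from another source.
device_dim_schema = StructType([
    StructField("device_id", StringType(), nullable=False),
    StructField("platform", StringType(), nullable=True),
    StructField("country", StringType(), nullable=True),
])

# Invented S3 paths, shown only to illustrate the shape of the model.
events = spark.read.schema(viewing_events_schema).parquet("s3://example-bucket/events/")
devices = spark.read.schema(device_dim_schema).parquet("s3://example-bucket/devices/")

# The model ties the two sources together on a shared key.
measured = events.join(devices, on="device_id", how="left")
```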
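A pipeline like those in the Airflow/Spark bullet might be orchestrated roughly as below. Again a sketch under assumptions: the DAG id, schedule, and the spark-submit job it triggers are made up for the example, not a description of our production pipelines:

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

# Hypothetical daily pipeline: ingest raw events, then aggregate with Spark.
with DAG(
    dag_id="daily_events_pipeline",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=10)},
) as dag:
    ingest = BashOperator(
        task_id="ingest_raw_events",
        # Placeholder command; a real task might pull from S3 or Kafka.
        bash_command="echo 'ingesting events for {{ ds }}'",
    )

    aggregate = BashOperator(
        task_id="aggregate_with_spark",
        # Hypothetical Spark job; the script name is invented for the example.
        bash_command="spark-submit --deploy-mode cluster aggregate_daily.py --date {{ ds }}",
    )

    # Aggregation runs only after ingestion succeeds.
    ingest >> aggregate
```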
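And one concrete flavor of the "optimize" bullet: at these volumes the network cost of a shuffle join can dominate a job, and broadcasting a small dimension table is one standard fix. A minimal PySpark sketch, reusing the hypothetical tables from the first example:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("join-optimization-sketch").getOrCreate()

events = spark.read.parquet("s3://example-bucket/events/")    # large fact table
devices = spark.read.parquet("s3://example-bucket/devices/")  # small dimension table

# Shuffle join: both sides are repartitioned by key across the network.
# Multiplied by a multi-terabyte events table, that shuffle is expensive.
shuffled = events.join(devices, "device_id")

# Broadcast join: the small table is shipped whole to every executor,
# so the large table never moves. Same result, far less network IO.
broadcasted = events.join(broadcast(devices), "device_id")
```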