Machine Learning Engineer
Location: Tokyo, Tokyo JP
Requisition Number: 204069
Position Title: PS Consultant (Mid Level)
As a Software Data Engineer, you will provide technical leadership to clients in a team that designs and develops path-breaking, large-scale cluster data processing systems. You will mentor sophisticated organizations on large-scale data and analytics and work with client teams to deliver results.
Additionally, as a member of our Consulting team, you will help Teradata Consulting establish thought leadership in the big data space by writing white papers and technical commentary and by representing our company at industry conferences.
Secondary Responsibilities for the Software Data Engineer
You will design and develop code, scripts and data pipelines that leverage structured and unstructured data integrated from multiple sources, and you will install and configure software. You will participate in and help lead requirements and design workshops with our clients, and you will develop project deliverable documentation.
- Proven expertise in production software development
- 3 years of experience programming in Java, Python, SQL, or C/C++
- Proficient in SQL, NoSQL, relational database design and methods for efficiently retrieving data
- Strong analytical skills
- Strong team player capable of working in a demanding start-up environment
- Experience building complex and non-interactive systems (batch, distributed, etc.)
Preferred Knowledge, Skills and Abilities:
- Experience with Hadoop, Hive, Pig, Avro, Thrift, Protobufs and JMS: ActiveMQ, RabbitMQ, JBoss, etc.
- Dynamic and/or functional languages (e.g., Python, Ruby, Scala, Clojure)
- Experience designing and tuning high performance systems
- Prior experience with data warehousing and business intelligence systems
- Linux expertise
- Prior work and/or research experience with unstructured data and data modeling
- Familiarity with different development methodologies (e.g., agile, waterfall, XP, scrum)
- Broader experience with the Spring ecosystem, including spring-batch, spring-mvc, and spring-hadoop
- Standards-based REST implementation
- Able to configure a Jenkins build, create/update a Jira ticket, and enable automated tests in a Gradle/Maven build
- Vagrant, Docker
- Familiar with the OSI stack and proper use of HTTP verbs
- Hive Tuning, Hive physical design (file formats, compression, partitioning, bucketing), Hive DSS Queries, Hive for Data Science, Hive for running R/Python (Streaming), Hive Transactions, Hive-HBase
- Knows how to tune a job, including parameters and more efficient API calls. Understands Spark SQL. Understands Spark Streaming and can transform a DStream.
- Ability to write advanced UDFs, SerDes and input loaders, perform log analysis, and explain how the logical operators map to the lower-level physical implementation
- Able to set up and leverage output from Ganglia, Nagios, Ambari, Cloudera Manager, etc.
- Able to link Kerberos KDC to a backing LDAP or Active Directory authentication provider
- Advanced understanding of access-driven key design, appropriate denormalization, co-locating records of differing schemas in a single table, etc.
- Understanding of best practices for Hive schemas: denormalization, partitioning and bucketing, file formats
- Able to create, write to and read from a Kafka topic. Understands how key partitioning works and can maintain an offset in the topic for consistent reading
Must be able to travel to client sites at least 50% of the time. Must be able to interact and communicate with the client in meetings. Must be able to write programming code in applicable languages. Must be able to write project documentation in English.
Bachelor's Degree or foreign equivalent in Computer Science or related technical field followed by three (3) years of progressively responsible professional experience programming in Java, Python or C/C++. Experience with production software development lifecycle. Experience with Linux, SQL, relational database design and methods for efficiently retrieving data. Experience building complex and non-interactive systems (batch, distributed, etc.).
Master's Degree or foreign equivalent in Computer Science or related technical field. Three (3) years of experience programming in Java, Python or C/C++. Experience with production software development lifecycle. Experience with Linux, SQL, relational database design and methods for efficiently retrieving
Community / Marketing Title: Machine Learning Engineer
Job Category: Consulting
With all the investments made in analytics, it’s time to stop buying into partial solutions that overpromise and underdeliver. It’s time to invest in answers. Only Teradata leverages all of the data, all of the time, so that customers can analyze anything, deploy anywhere, and deliver analytics that matter most to them. And we do it at scale, on-premises, in the Cloud, or anywhere in between.
We call this Pervasive Data Intelligence. It’s the answer to the complexity, cost, and inadequacy of today’s analytics. And it's the way Teradata transforms how businesses work and people live through the power of data throughout the world. Join us and help create the era of Pervasive Data Intelligence.