Job description
The data archive engineer will work with IT database professionals and with application groups to identify opportunities for archiving data from operational databases. Expected data sources include DB2, Oracle, MS SQL Server, Sybase, Postgres, and other relational databases on distributed systems. Archiving may also be required from nonrelational legacy databases.
The data archive engineer will assist the archive business analyst in preparing the business case for potential applications. The analyst will define the appropriate requirements and archiving strategy for each application.
After an application is approved for implementation, the data archive engineer will identify the operational data sources that will supply data for the archive, along with the associated metadata that describes the data. The data archive engineer will analyze the data and metadata to determine its accuracy and completeness, and will enhance the metadata to overcome any shortcomings.
The data archive engineer will design the data structures to be extracted for archiving and their archive representation, and will develop policies for when to extract data and when to discard data from the archive. A storage policy will be created that considers long-term retention requirements, data loss protection, and cost.
The data archive engineer will be responsible for the proper scheduling and execution of database archiving tasks, and will also assist users in developing access to information in the archive. This includes finding the relevant archive structures, interpreting the metadata, and possibly formulating access queries.
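The extract-and-discard policies described above can be sketched in miniature. This is a hypothetical example using SQLite and a simple age-based retention rule; the `orders` table, its schema, and the seven-year cutoff are illustrative assumptions, not details from this posting:

```python
import sqlite3
from datetime import date, timedelta

# Hypothetical policy: move rows older than the retention cutoff into an
# archive table, then remove them from the operational table.
RETENTION_YEARS = 7

def archive_old_rows(conn: sqlite3.Connection, cutoff: str) -> int:
    """Copy rows with order_date before `cutoff` to the archive, then delete them."""
    # Create the archive table with the same shape as the source (empty copy).
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders_archive AS SELECT * FROM orders WHERE 0"
    )
    cur = conn.execute(
        "INSERT INTO orders_archive SELECT * FROM orders WHERE order_date < ?",
        (cutoff,),
    )
    archived = cur.rowcount
    conn.execute("DELETE FROM orders WHERE order_date < ?", (cutoff,))
    conn.commit()
    return archived

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, order_date TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "2010-01-15"), (2, "2011-06-30"), (3, "2024-03-01")],
)
# ISO date strings compare correctly as text, so the cutoff can be a string.
cutoff = (date(2024, 6, 1) - timedelta(days=365 * RETENTION_YEARS)).isoformat()
n = archive_old_rows(conn, cutoff)
```

In practice the extract step would write to the archive platform rather than another table, but the shape of the policy (select by age, copy, verify, delete) is the same.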
We seek smart, focused, passionate self-starters who bring energy, new ideas and practical experience to a fast-paced and dynamic team, are obsessed with delivering useful and elegant solutions, and care deeply about their customers and each other.
Responsibilities include, but are not limited to, the following:
- You will design and architect solutions based on OpenText InfoArchive, helping our application owners identify the information they can archive and the process for archiving it
- You will help to design the configuration required on InfoArchive in order to meet the data retrieval and compliance requirements of our application owners
- You will help to grow our knowledge and capability in areas of Information Governance, including addressing GDPR and other regulations both in the US and internationally.
- Ideally you will bring experience of delivering solutions using OpenText InfoArchive
- You will use the Archon extraction tool to select, extract, reformat, and load data into InfoArchive, ensuring the chain of custody is maintained.
- You will map an InfoArchive structure to source application data for each archived application (one time or ongoing) based on identified requirements
- You will administer and support the InfoArchive environment.
- You will apply your knowledge and experience of information modelling and your good knowledge of relational database management systems
- You will implement Information Lifecycle Management including Retention Management and Disposition
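As a rough illustration of the SIP and chain-of-custody items above: an InfoArchive SIP is conventionally a zip containing a descriptor (`eas_sip.xml`) and the packaged data (`eas_pdi.xml`). The sketch below is a minimal, hypothetical packager; the entry names follow that convention, but the XML payloads and the checksum step are simplified placeholders, not the product's actual schema or tooling:

```python
import hashlib
import io
import zipfile

def build_sip(pdi_xml: bytes, sip_descriptor_xml: bytes) -> tuple[bytes, str]:
    """Package descriptor and data into a SIP zip; return (zip bytes, payload sha256)."""
    # Record a checksum of the payload before packaging, so the receiving side
    # can verify the data was not altered in transit (chain of custody).
    digest = hashlib.sha256(pdi_xml).hexdigest()
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("eas_sip.xml", sip_descriptor_xml)  # descriptor
        zf.writestr("eas_pdi.xml", pdi_xml)             # packaged data
    return buf.getvalue(), digest

# Illustrative placeholder content only.
pdi = b"<records><record><id>1</id></record></records>"
descriptor = b"<sip><application>orders</application></sip>"
sip_bytes, sha = build_sip(pdi, descriptor)
```

A real pipeline would generate the descriptor from the application's holding configuration and hand the zip to the ingestion tool; the point here is only the package-plus-checksum shape.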
Education:
- BS/BA degree in scientific, business or technology related field.
Experience:
- Active working experience (3-5 years) in a pharmaceutical, healthcare, IT, or related industry.
Skills:
- Experience with the InfoArchive shell, the Submission Information Packages (SIP) SDK, the InfoArchive REST architecture, and InfoArchive gateway/web applications.
- Thorough understanding of concepts of table archiving, data record archiving, file archiving and compound business record archiving.
- Experience in shell/bash scripting; in creating and installing archival applications; and in creating retention, search, and related configurations, including encryption and masking in data ingestion and search.
- Experience in Java and J2EE software development, and good experience with HTML/JavaScript and other web technologies.
- Experience developing custom ETL plugins, or using third-party ETL tools, for SIP creation and ingestion.
- Experience developing InfoArchive REST API-based clients to automate activities such as data ingestion, extraction, and the creation of applications and other system resources.
- Knowledge of the xDB database and its configuration and integration with InfoArchive, along with XQuery development for creating search forms for table archiving applications.
- Knowledge of system architecture design and planning, utilizing physical and virtualized environments.
- Develop and maintain scalable ETL/ELT jobs in PostgreSQL with various sources and targets such as SQL Server, Oracle, DB2, SAP HANA, Hadoop, cloud storage, S3, FTP, SFTP, etc.
- Provide technical assistance to the team and evaluate jobs.
- Ability to work independently with minimal supervision
- Develop mapping document and maintain sufficient documentation on mapping rules.
- Perform unit testing, volume testing and end-to-end integration testing.
- Work closely with Technology Leadership, Product Managers, and Reporting Team for understanding the functional and system requirements
- Work closely with our QA Team to ensure data integrity and overall system quality
- Knowledge of RDBMS concepts and hands-on experience working with Oracle and SQL Server, including structured/unstructured databases and query management.
- Good team player with effective motivation, collaboration, and prioritization skills.
- Strong planning and analytical skills combined with ability to work with geographically distributed teams in a rapid development model.
- Excellent problem solving, planning, and organizing skills and flexible approach.
- Ability to explain technical topics in a manner non-technical stakeholders can easily understand.
- Operating systems: Linux and/or Windows
- Web and application servers: Apache Tomcat, etc.
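The mapping-and-masking theme that recurs in the items above (mapping documents, masking in data ingestion) can be sketched as follows. The mapping document, field names, and masking rule are all hypothetical, chosen only to show the shape of the step:

```python
import hashlib

# Hypothetical mapping document: source column -> archive field name.
MAPPING = {"cust_id": "customerId", "cust_email": "email", "ord_total": "orderTotal"}
MASKED_FIELDS = {"email"}  # archive fields to mask before ingestion

def mask(value: str) -> str:
    """Replace a sensitive value with a stable, irreversible token."""
    return "masked-" + hashlib.sha256(value.encode()).hexdigest()[:8]

def to_archive_record(source_row: dict) -> dict:
    """Apply the mapping rules to one source row, masking sensitive fields."""
    record = {}
    for src_col, dst_field in MAPPING.items():
        value = source_row[src_col]
        if dst_field in MASKED_FIELDS:
            value = mask(str(value))
        record[dst_field] = value
    return record

row = {"cust_id": 42, "cust_email": "jane@example.com", "ord_total": 99.5}
rec = to_archive_record(row)
```

Keeping the mapping as data rather than code is what makes the mapping document maintainable: reviewers can audit the rules without reading the transformation logic.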
Good to Have:
- Sound knowledge of application sizing and fair knowledge of hardware infrastructure.
- Good understanding of internet protocols, network, and application security methods.
- Experience enabling SSO configurations within the InfoArchive platform.
- Knowledge of working with ECM (Enterprise Content Management) systems such as Documentum, OpenText Content Server, etc.
- Experience working on cloud infrastructure such as AWS, Google Cloud, Azure, etc.
- Experience with ETL tools such as Archon, Informatica, SSIS, etc.