What is SAP Data Services?
SAP BO Data Services is an ETL tool used for data integration, data quality, data profiling and data processing. It allows you to integrate and transform trusted data and deliver it to a data warehouse system for analytical reporting.
BO Data Services consists of a UI development interface, a metadata repository, data connectivity to source and target systems, and a management console for scheduling jobs.
What is a repository in BODS? What are the different types of Repositories in BODS?
A repository is used to store the metadata of objects used in BO Data Services. Each repository should be registered in the Central Management Console (CMC) and is linked with one or more job servers, which are responsible for executing the jobs that you create.
There are three types of Repositories:
Local Repository: It is used to store the metadata of all objects created in Data Services Designer, such as projects, jobs, data flows, work flows, etc.
Central Repository: It is used to control the version management of objects and is used for multiuser development. The Central Repository stores all the versions of an application object, so it allows you to revert to previous versions.
Profiler Repository: This is used to manage all the metadata related to profiler tasks performed in SAP BODS Designer. In addition, the CMS Repository stores metadata of all the tasks performed in CMC on the BI platform, and the Information Steward Repository stores all the metadata of profiling tasks and objects created in Information Steward.
What are single-use objects and reusable objects in Data Services?
Reusable Objects − Most of the objects that are stored in the repository can be reused. When a reusable object is defined and saved in the local repository, you can reuse the object by creating calls to the definition. Each reusable object has only one definition, and all the calls to that object refer to that definition. If the definition of an object is changed in one place, the change applies everywhere that object appears.
The object library contains object definitions. When an object is dragged and dropped from the library, a new reference to the existing object is created.
Single-Use Objects − Objects that are defined specifically for one job or data flow are called single-use objects. An example is a specific transformation used in a single data load.
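The "one definition, many calls" behaviour of reusable objects can be illustrated with a short Python analogy. This is not Data Services code: the ObjectLibrary class, its methods, and the object names are hypothetical and exist only to show that every call refers to the same stored definition.

```python
# Conceptual analogy only, not Data Services code. All names are hypothetical.

class ObjectLibrary:
    """Holds exactly one definition per reusable object name."""

    def __init__(self):
        self._definitions = {}

    def define(self, name, definition):
        self._definitions[name] = definition

    def call(self, name):
        # A "call" hands back a reference to the stored definition, not a copy.
        return self._definitions[name]


library = ObjectLibrary()
library.define("DF_LoadCustomers", {"target": "CUSTOMER_DIM"})

# Two jobs "drag" the same data flow out of the library.
job_a_ref = library.call("DF_LoadCustomers")
job_b_ref = library.call("DF_LoadCustomers")

# Changing the definition through one reference is visible through the other.
job_a_ref["target"] = "CUSTOMER_DIM_V2"
print(job_b_ref["target"])
```

A single-use object would correspond to a value created inline inside one job, with no entry in the library and therefore nothing for other jobs to reference.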
What is a real-time job?
Real-time jobs “extract” data from the body of the real-time message received and from any secondary sources used in the job.
How do you manage object versions in BODS?
The central repository is used to control the version management of objects and is used for multiuser development. It stores all the versions of an application object, so it allows you to revert to previous versions.
What is a template table?
In Data Services, you can create a template table to move data to a target system. The template table has the same structure and data types as the source table.
What is SAP Data Services Designer? What are the main ETL functions that can be performed in the Designer tool?
It is a developer tool used to create objects that consist of data mappings, transformations, and logic. It is GUI based and works as the designer for Data Services.
You can create various objects using Data Services Designer, such as projects, jobs, work flows, data flows, mappings, transformations, etc.
What are the different types of files that can be used as source and target file formats?
- Delimited
- SAP Transport
- Unstructured Text
- Unstructured Binary
- Fixed Width
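To make the difference between the delimited and fixed-width formats concrete, here is a minimal Python sketch of how each layout is split into fields. It is an illustration only, not how Data Services reads files; the function names, the sample records, and the column widths are assumptions made for the example.

```python
import csv
import io


def parse_delimited(text, delimiter=","):
    """Split each line into fields on a delimiter character."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


def parse_fixed_width(text, widths):
    """Cut each line into fields at fixed column widths and strip the padding."""
    rows = []
    for line in text.splitlines():
        fields, start = [], 0
        for width in widths:
            fields.append(line[start:start + width].strip())
            start += width
        rows.append(fields)
    return rows


# The same two hypothetical records in both layouts.
delimited_rows = parse_delimited("101|Alice|Berlin\n102|Bob|Paris\n", delimiter="|")
fixed_rows = parse_fixed_width("101Alice Berlin\n102Bob   Paris \n", widths=[3, 6, 6])

print(delimited_rows)
print(fixed_rows)
```

In the delimited layout the separator character marks field boundaries, while in the fixed-width layout the boundaries come from the agreed column widths, so short values must be padded.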
What is the use of a data flow in DS?
A data flow is used to extract, transform and load data from a source to a target system. All transformation, loading and formatting occur in the data flow.
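The extract, transform and load steps of a data flow can be sketched as a small Python pipeline. This is a conceptual analogy rather than Data Services code: the function names, the in-memory source and target lists, and the cleansing rules are all assumptions chosen for illustration.

```python
# Conceptual analogy only, not Data Services code. All names are hypothetical.

def extract(source_rows):
    """Read rows from the source, copying each so the source is left untouched."""
    for row in source_rows:
        yield dict(row)


def transform(rows):
    """Apply formatting rules: trim and upper-case names, convert amounts to numbers."""
    for row in rows:
        row["name"] = row["name"].strip().upper()
        row["amount"] = round(float(row["amount"]), 2)
        yield row


def load(rows, target):
    """Append transformed rows to the target and return how many were loaded."""
    count = 0
    for row in rows:
        target.append(row)
        count += 1
    return count


source = [{"name": " alice ", "amount": "10.5"}, {"name": "bob", "amount": "3"}]
target = []
loaded = load(transform(extract(source)), target)
print(loaded, target)
```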
What are different objects that you can add to a dataflow?
What are the different properties that you can set for a data flow?
- Execute once
- Parallelism
- Database links
- Cache
Why do you use a work flow in DS?
Work flows are used to determine the process for executing data flows. The main purpose of a work flow is to prepare for executing the data flows and to set the state of the system once data flow execution is completed.
What are the different objects that you can add to work flow?
- Work flow
- Data flow
- Scripts
- Loops
- Conditions
- Try/Catch blocks
What is the use of conditionals?
You can add conditionals to a work flow. This allows you to implement If/Then/Else logic in the work flow.
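As a rough analogy, a work flow that combines scripts, a conditional and a try/catch block behaves like the Python function below. This is not Data Services script syntax; the function, its parameters and the log messages are hypothetical and only show the order in which the pieces run.

```python
# Conceptual analogy only, not Data Services code. All names are hypothetical.

def run_work_flow(source_has_rows, data_flow, log):
    """Prepare, branch on a condition, run the data flow, then set the final state."""
    log.append("script: initialize")
    try:
        if source_has_rows:
            # "Then" branch of the conditional.
            log.append("conditional: then, run data flow")
            data_flow()
        else:
            # "Else" branch of the conditional.
            log.append("conditional: else, skip load")
    except Exception as exc:
        # Catch block: handle the error instead of failing the whole work flow.
        log.append(f"catch: {exc}")
    log.append("script: finalize")
    return log


def failing_data_flow():
    raise RuntimeError("target table locked")


ok_log = run_work_flow(True, lambda: None, [])
skip_log = run_work_flow(False, lambda: None, [])
error_log = run_work_flow(True, failing_data_flow, [])
print(error_log)
```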
What is a transformation in Data Services?
Transforms are used to manipulate input data sets and create one or more outputs. There are various transforms that can be used in Data Services.
What are the common transformations that are available in Data Services?
- Data Integration
- Data Quality
- Platform
- Merge
- Query
- Text data processing
What are different transformations under data integration?
- Data_Generator
- Data_Transfer
- Effective_Date
- Hierarchy_Flattening
- Table_Comparison, etc.
What is the text data processing transformation?
This allows you to extract specific information from large volumes of text. You can search for facts and entities such as customers, products, and financial facts specific to an organization.
This transform also checks the relationship between entities and allows the extraction.
The data extracted using text data processing can be used in Business Intelligence, Reporting, query, and analytics.
What is the difference between text data processing and data cleansing?
Text data processing is used to find relevant information in unstructured text data, whereas data cleansing is used to standardize and cleanse structured data.
What is a real-time job in Data Services?
You can create real-time jobs in Data Services Designer to process real-time messages. Like a batch job, a real-time job extracts, transforms and loads data.
Each real-time job can extract data from a single message, and it can also extract data from other sources such as tables or files.
What is an embedded data flow?
An embedded data flow is a data flow that is called from another data flow in the design. It can contain multiple sources and targets, but only one input or output passes data to the main data flow.
What are the different types of embedded data flow?
One Input: The embedded data flow is added at the end of a data flow.
One Output: The embedded data flow is added at the beginning of a data flow.
No input or output: The embedded data flow replicates an existing data flow.
What are local and global variables in a Data Services job?
Local variables in Data Services are restricted to the object in which they are created.
Global variables are restricted to the job in which they are created. Global variables have default values, and you can change those values at run time.
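The difference in scope can be sketched with a Python analogy. This is not Data Services code: the variable names (including the $G_ prefix), the default values and the run_job function are assumptions made for illustration.

```python
# Conceptual analogy only, not Data Services code. All names are hypothetical.

# Default values of the job's global variables.
DEFAULT_GLOBALS = {"$G_LOAD_DATE": "2024-01-01", "$G_REGION": "EU"}


def run_job(overrides=None):
    # Global variables: visible to every object in this job.
    # Defaults can be replaced with values supplied at run time.
    job_globals = {**DEFAULT_GLOBALS, **(overrides or {})}

    def work_flow():
        # Local variable: exists only inside this work flow.
        row_limit = 100
        return f"{job_globals['$G_REGION']}:{job_globals['$G_LOAD_DATE']}:{row_limit}"

    return work_flow()


default_run = run_job()
override_run = run_job({"$G_REGION": "US"})
print(default_run, override_run)
```

Here row_limit cannot be read outside work_flow, while job_globals is shared by everything inside run_job, and a second job would get its own copy.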
What are the different recovery mechanisms that can be used for failed jobs?
Automatic Recovery: This allows you to run unsuccessful jobs in recovery mode.
Manual Recovery: This allows you to rerun the jobs without considering the partial execution of the previous run.
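One way to picture recovery mode is a job runner that remembers which steps finished and skips them on the rerun. The Python sketch below is an analogy under that assumption, not the Data Services implementation; the run_with_recovery function, the step names and the simulated failure are all hypothetical.

```python
# Conceptual analogy only, not Data Services code. All names are hypothetical.

def run_with_recovery(steps, completed, recover=False):
    """Run steps in order; in recovery mode, skip steps recorded as completed."""
    if not recover:
        completed.clear()  # A normal run starts from scratch.
    executed = []
    for name, action in steps:
        if recover and name in completed:
            continue  # Finished in the previous run, so it is not repeated.
        action()  # May raise; steps after a failure are not reached.
        completed.add(name)
        executed.append(name)
    return executed


attempts = {"count": 0}


def flaky_load():
    # Simulated failure: the first attempt fails, later attempts succeed.
    attempts["count"] += 1
    if attempts["count"] == 1:
        raise RuntimeError("target unavailable")


steps = [("extract", lambda: None), ("load", flaky_load), ("cleanup", lambda: None)]
completed = set()

try:
    run_with_recovery(steps, completed)
except RuntimeError:
    pass  # The first run fails during "load".

rerun = run_with_recovery(steps, completed, recover=True)
print(rerun)
```

With manual recovery there is no such record of completed steps, so the job itself has to be designed so that running everything again gives a correct result.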
What is the use of Data Profiling?
Data Services Designer provides a Data Profiling feature to ensure and improve the quality and structure of source data. Data Profiler allows you to −
Find anomalies in the source data, decide on validation and corrective action, and assess the quality of the source data.
Understand the structure and relationships of the source data for better execution of jobs, work flows and data flows.
Examine the content of the source and target systems to determine that your job returns the expected result.
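The kind of statistics a column profile produces can be sketched in a few lines of Python. This is only an illustration of the idea, not the Data Profiler itself; the profile_column function, the chosen statistics and the sample rows are assumptions.

```python
# Conceptual analogy only, not the Data Services profiler. All names are hypothetical.

def profile_column(rows, column):
    """Summarize one column: row count, missing values, distinct values, min and max."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v not in (None, "")]
    return {
        "rows": len(values),
        "nulls": len(values) - len(present),
        "distinct": len(set(present)),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
    }


rows = [
    {"country": "DE"},
    {"country": "FR"},
    {"country": ""},
    {"country": "DE"},
    {"country": None},
]
profile = profile_column(rows, "country")
print(profile)
```

A high count of missing values or an unexpected number of distinct values is the sort of anomaly that would prompt validation or corrective action before the data is loaded.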
What do you understand by multiuser development in BODS? How do you manage multiuser development?
SAP BO Data Services supports multiuser development, where each user can work on an application in their own local repository. Each team uses a central repository to save the main copy of an application and all the versions of the objects in the application.
SCDs are dimensions that have data that changes over time.
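One common way to keep the history of such changes is to close the current row and add a new one, often called a Type 2 approach. The text above does not say which approach is meant, so the Python sketch below is an assumption for illustration only; the apply_scd2 function and the column names are hypothetical.

```python
# Illustrative sketch of a Type 2 style update. All names are hypothetical.

def apply_scd2(dimension, key, new_attrs, change_date):
    """Close the current row for a key if its attributes changed, then add a new row."""
    current = next(
        (row for row in dimension if row["key"] == key and row["is_current"]),
        None,
    )
    if current is not None:
        if current["attrs"] == new_attrs:
            return dimension  # Nothing changed, so no new version is needed.
        current["valid_to"] = change_date
        current["is_current"] = False
    dimension.append({
        "key": key,
        "attrs": dict(new_attrs),
        "valid_from": change_date,
        "valid_to": None,
        "is_current": True,
    })
    return dimension


customer_dim = []
apply_scd2(customer_dim, 1, {"city": "Berlin"}, "2023-01-01")
apply_scd2(customer_dim, 1, {"city": "Munich"}, "2024-06-01")
apply_scd2(customer_dim, 1, {"city": "Munich"}, "2024-07-01")  # No change, no new row.
print(customer_dim)
```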