How to create an effective custom RFC function to integrate millions of records super fast with SAP BODS

As an ABAPer, I have the opportunity to support several systems and different teams in every project. SAP offers many tools to integrate SAP with legacy systems. One such tool is SAP BO Data Services (BODS), an ETL (Extract, Transform and Load) tool for SAP. SAP Data Services is a wonderful tool for data integration, data profiling, and data processing. It helps to integrate and transform trusted data into a data warehouse system for analytical reporting. SAP BODS also includes a management console for scheduling jobs, and it comprises a smooth UI development interface, a metadata repository, and data connectivity to source and target systems. If you want to learn more about SAP BODS, please let us know in the comments section so that we can plan some useful articles on this subject.

We can use SAP remote function calls (RFCs) in queries created in Data Services data flows. In addition, Data Services provides the SAP application BAPI interface to support the use of remote function calls designed for business transactions (BAPIs). We use the SAP application datastore to import BAPI function metadata. SAP functions that are not RFC-enabled can be used in ABAP data flows with the following restrictions:

  • The function can have multiple input parameters, but they must all be scalar. The function cannot use table parameters.
  • For the output, you can select only one scalar parameter.

Data Services cannot use normal functions in regular data flows, because a normal SAP function is not an RFC function. However, we can write an RFC-enabled wrapper function module around the normal function. Wrapped this way, the normal function can be used in data flows, including with table parameters.

So what is so special about this? Well, we are talking about loading millions of records into SAP with SAP BODS. Imagine you have to load 20 million sales movements, and it has to be done in the fastest way possible.

This is where we come in. When creating an RFC function module that will be called from SAP BODS, there are some special considerations:

1. First and foremost, the function module must be remote-enabled.

2. All the input parameters must be marked as Pass Value (pass by value).

3. If you have any table parameters, they need to be defined on the Tables tab when you create the function module in transaction SE37, and marked as optional.

4. The table parameters should be typed with LIKE, even if this option doesn’t appear in the drop-down.

This kind of typing (i.e., LIKE) is marked as obsolete in the SAP ABAP help documentation, but that does not matter for SAP BODS integration. A very common error when consuming an RFC from BODS is that the ABAPer defines the formal parameter as a CHANGING parameter, which does not work from BODS.

5. Define basic exceptions, so that you can raise an exception if the parameters are not in the correct format or the tables are empty.

6. When legacy information is integrated with SAP, two-phase staging is usually recommended for high-volume interfaces. Just as it is recommended not to process inbound IDocs immediately, calling the BAPI or transaction directly from BODS can put overhead on the system or block other users. Instead, consolidate the information in a custom Z table, and each time before you load data, clear the previous contents by truncating the table.

This is better than using DELETE. Remember that we are loading millions of records into SAP, and every day the data loaded the day before has to be removed. Emptying a table of 20 million rows with a DELETE statement is not good for system performance.

** Staging is a commonly used term in SAP BODS. It means the place where the information rests before it reaches the final target. For example, when you load information from legacy systems to SAP, you might first save (stage) it in some other database, commonly a SQL database, work on it (transform it if necessary), and then persist it in SAP. After all, SAP BODS is an ETL (Extract, Transform and Load) tool.

**** Truncate is better than Delete because it doesn’t generate a log in the database, and it is faster.
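
Points 5 and 6 can be modelled outside SAP. The following is a minimal, language-neutral sketch in Python. It is not ABAP and not the function module used in this project. The names `StagingTable`, `EmptyInputError` and `InvalidRecordError`, and the fields `doc_id` and `quantity`, are hypothetical. It only illustrates the flow: reject bad input with an exception, stage the records without processing them, and truncate the staging area before the next load.

```python
from dataclasses import dataclass, field


class EmptyInputError(Exception):
    """Raised when the caller passes an empty table (hypothetical name)."""


class InvalidRecordError(Exception):
    """Raised when a record is not in the expected format (hypothetical name)."""


@dataclass
class StagingTable:
    """In-memory stand-in for the custom Z staging table."""
    rows: list = field(default_factory=list)

    def truncate(self) -> None:
        # Drop all rows in one operation instead of deleting them one by one.
        self.rows.clear()

    def load(self, records: list) -> int:
        # Mirror the basic exceptions of the function module:
        # reject an empty table and badly formatted records up front.
        if not records:
            raise EmptyInputError("input table is empty")
        for rec in records:
            if not isinstance(rec.get("doc_id"), str) or not isinstance(rec.get("quantity"), int):
                raise InvalidRecordError(f"bad record: {rec!r}")
        # Only stage the data; processing happens in a later, separate step.
        self.rows.extend(records)
        return len(records)


staging = StagingTable()
staging.load([{"doc_id": "A1", "quantity": 5}])   # yesterday's load
staging.truncate()                                # clear before today's load
loaded = staging.load([{"doc_id": "B1", "quantity": 2}, {"doc_id": "B2", "quantity": 7}])
print(loaded, len(staging.rows))
```

All records are validated before any of them is staged, so a failed call leaves the staging area unchanged. In the real function module, the same checks would map to the exceptions defined in point 5.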


After that, you need to activate the function module and notify the SAP BODS developer so that they can consume it.

The developer then logs on on the BODS side.

You will need to import the RFC function by its name.

Once this is done, the function can be used in any ETL job in the project. In this case, we used the function to achieve parallelism by calling it from three different workflows.

Each workflow queries a range of information from the legacy system and then loads it into SAP using the function. The input parameters for the function call are defined inside each workflow.
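
How the ranges are chosen depends on the legacy data. As a rough illustration only, assuming the records can be addressed by a contiguous numeric key (an assumption, the actual selection criteria are not described here), the following Python sketch splits a key range into equal, non-overlapping parts, one per workflow. The function name `split_range` is hypothetical.

```python
def split_range(low: int, high: int, parts: int) -> list[tuple[int, int]]:
    """Split the inclusive range [low, high] into contiguous, non-overlapping ranges."""
    if parts < 1 or high < low:
        raise ValueError("need parts >= 1 and high >= low")
    total = high - low + 1
    base, extra = divmod(total, parts)
    ranges = []
    start = low
    for i in range(parts):
        # The first `extra` ranges take one additional key each.
        size = base + (1 if i < extra else 0)
        if size == 0:
            break  # fewer keys than parts: stop early
        ranges.append((start, start + size - 1))
        start += size
    return ranges


# Three workflows, each loading its own slice of the keys.
workflow_ranges = split_range(1, 15_000_000, 3)
print(workflow_ranges)
```

Because the ranges do not overlap, the three workflows can call the function in parallel without loading the same record twice.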

Once every step shows a green signal, we execute the job. For this run, we are going to load about 15 million records.

Once the job has completed successfully, we can check the target table in SAP to validate that 15.7 million entries were populated, which the image below confirms.