As documents travel through Boomi processes, they get enhanced: more fields get added. You can do this either by enhancing the profile and using a map to enrich the data, or by adding the additional fields as properties of the document. Programmers also try to reuse certain processes, as one does in a programming language. Should that reuse rely on profiles or on document properties? Also, if you have 30,000 documents and each document is 10 to 50 KB in size, what is the impact of set properties? And is it better to forward only documents with the minimum set of data, or to pass the original set even if it is large? As I learn more I will document some trade-offs and a few thoughts.
satya - 11/17/2016, 9:58:21 AM
Observations on setting document properties and performance
Casual observation of log files seems to indicate that setting document properties takes time. For example, for 30k EDI documents I have seen 3 minutes.
The time may vary with the size of the individual documents and also with their collective size. These numbers still need to be measured.
If the process is long, with many sub processes and hundreds of shapes, the number of set properties could add up.
Set properties can also be misused: connectors end up pouring their data into set properties, when the alternative is to make connector calls directly and merge the results back into the main document under an enhanced profile with more fields. I suspect the latter is the better approach. Connector calls seem to be significantly faster than calling out from a set properties shape or a map.
satya - 11/17/2016, 1:46:45 PM
Set properties vs adding fields to the document
1. Call outs from set properties to connectors appear to be less efficient. How much less? I am not sure of the numbers.
2. If there is only one field to be added, a map and a document cache may be overkill.
3. If there are 5 fields from a SQL query that need to be added, it may be better to have a connector fetch the fields and map them back. With set properties you may have to call that SQL 5 times to set the fields individually. Maybe there is a way to spray the fields from a single SQL call into multiple document properties (I don't think so!)
4. At times it may even work to have set properties pull the multiple values from a doc cache and set them as document properties.
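To make points 3 and 4 concrete, here is a rough sketch in Python, not Boomi. Plain dicts stand in for documents, properties and the doc cache, and `lookup_customer` is a made-up stand-in for a SQL connector call, so every name here is an assumption for illustration only. It just counts how many lookups happen when each property calls out on its own versus when one row is fetched once and cached.

```python
# Toy model only: dicts stand in for documents, properties and a doc cache.
# None of these names are Boomi APIs.

calls = {"count": 0}
FIELDS = ("name", "region", "tier", "currency", "terms")


def lookup_customer(customer_id):
    """Hypothetical stand-in for one SQL connector call returning 5 fields."""
    calls["count"] += 1
    return {
        "name": f"customer-{customer_id}",
        "region": "NA",
        "tier": "gold",
        "currency": "USD",
        "terms": "NET30",
    }


def enrich_per_field(doc):
    """One call-out per property: 5 fields cost 5 lookups per document."""
    return {f: lookup_customer(doc["customer_id"])[f] for f in FIELDS}


def enrich_with_cache(doc, cache):
    """One lookup per distinct key; all 5 fields then come from the cached row."""
    key = doc["customer_id"]
    if key not in cache:
        cache[key] = lookup_customer(key)
    return dict(cache[key])


docs = [{"customer_id": i % 3} for i in range(9)]  # 9 docs, 3 distinct customers

calls["count"] = 0
per_field = [enrich_per_field(d) for d in docs]
per_field_calls = calls["count"]

calls["count"] = 0
cache = {}
cached = [enrich_with_cache(d, cache) for d in docs]
cached_calls = calls["count"]

print("lookups, one call-out per property:", per_field_calls)
print("lookups, one cached row per key:   ", cached_calls)
```

Both approaches end up with the same five values per document; only the number of lookups differs. Whether Boomi behaves this way in practice is exactly what still needs measuring.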
satya - 11/17/2016, 1:49:16 PM
set properties and reusable sub processes
Sometimes developers use sub processes to provide common logic based solely on document properties and not on the document itself.
Such an approach is misleading in my opinion. A better way is to define an explicit data structure (profile) that the sub process knows and declares it can handle, and to have the calling process map to that profile.
This could possibly avoid carrying large documents across a sub process if they don't have to be.
satya - 11/17/2016, 1:58:02 PM
Set properties and large size documents
It is expensive to carry thousands of large documents across shapes and sub processes. This much is clear.
So if you rely on set properties to get your job done, you are carrying the large documents through unnecessarily when it is really the metadata (the set properties) that does your work.
One option is to create a smaller profile and move your properties into that smaller profile and move that smaller document set across.
There is one case where set properties might make sense, where you have to set one or two fields and using a map instead might make the process take longer as that takes copying a large document field by field to the new document format.
Do your work with the smallest set of documents you can. Don't let very large documents carry your properties when the documents themselves are not used!
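A small Python sketch of the "smaller profile" idea, assuming a made-up document shape: three fields that downstream logic reads plus a bulky payload that it does not. The field names and the 20,000-character payload are assumptions, not taken from any real profile. The mapping step keeps only the needed fields, and the sketch compares the total bytes carried either way.

```python
import json

NEEDED = ("order_id", "partner", "status")  # fields the downstream logic reads


def make_large_doc(i):
    """Hypothetical large document: a few routing fields plus a bulky payload."""
    return {
        "order_id": i,
        "partner": f"P{i % 4}",
        "status": "NEW",
        "payload": "x" * 20_000,  # stands in for the large document body
    }


def to_slim(doc):
    """Map the large document to a small profile holding only the needed fields."""
    return {k: doc[k] for k in NEEDED}


def size_bytes(docs):
    """Total serialized size of a document set."""
    return sum(len(json.dumps(d).encode("utf-8")) for d in docs)


large = [make_large_doc(i) for i in range(100)]
slim = [to_slim(d) for d in large]

print("bytes carried, large set:", size_bytes(large))
print("bytes carried, slim set: ", size_bytes(slim))
```

The slim set carries the same decision-making fields at a small fraction of the size, which is the point of mapping down before a long stretch of shapes or a sub process call.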
satya - 11/17/2016, 2:00:46 PM
It (may) be ok to set properties when the documents are large
If the documents are large, mapping them to add a new field while repeating the rest of the fields could be time consuming.
In such cases adding one or more fields through document properties might leave the document intact and add the fields on top.
There may be a compromise.
Even here, if the number of properties increases, you may want to consider fetching the values through a connector and using a doc cache to pull the property values.
satya - 11/17/2016, 2:01:58 PM
A large number of set property shapes in a process shape may be a sign of bad design
Not sure, maybe. Just a suspicion!
satya - 11/17/2016, 2:05:59 PM
what you might be doing wrong
1. Passing large documents through an entire process when a number of unique smaller documents may have worked.
2. Using large documents to carry the document properties as the main processing fields.
3. Not using maps to sub select or super select your profiles as they go along and instead using set properties.
4. Designing sub processes that use document properties as their primary input when it should be profiles.
5. Not using connectors to pull values to be used either as set properties or as inputs to a map.
6. Not using smaller documents
satya - 11/19/2016, 10:25:13 AM
Timing of the set properties shape for 30k docs
//No callouts, but a simple static value
//For setting 1 property
Small size: 6 seconds
Med/Large size: 17 seconds
//For setting 10 properties
Small size: 15 seconds
Med/Large size: 30 seconds
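To put these batch times in per-document terms, here is the arithmetic as a short Python sketch. The only inputs are the four timings above and the 30,000 document count. The "seconds per extra property" figure assumes cost grows linearly between 1 and 10 properties, which is my assumption and not something measured.

```python
DOCS = 30_000

# Seconds observed for the whole batch, keyed by (doc size, properties set).
timings = {
    ("small", 1): 6,
    ("small", 10): 15,
    ("med/large", 1): 17,
    ("med/large", 10): 30,
}

for (size, props), seconds in timings.items():
    per_doc_ms = seconds / DOCS * 1000
    print(f"{size:>9}, {props:>2} properties: {per_doc_ms:.2f} ms per document")


def marginal_seconds(size):
    """Extra batch seconds per additional property, assuming linear growth."""
    return (timings[(size, 10)] - timings[(size, 1)]) / 9


for size in ("small", "med/large"):
    print(f"{size}: about {marginal_seconds(size):.2f} s per extra property")
```

So a single set properties shape costs somewhere between 0.2 ms and 1 ms per document in these runs. That is small on its own, but it is paid again at every such shape in the process.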
satya - 11/19/2016, 10:28:06 AM
Timing for getting through a Groovy shape for 30k docs
//Groovy with a for loop
//setting the docs back out
small size: 7 seconds
Med/large: 18 seconds
//Groovy with NO for loop
//just setting a process property
small size: 2 seconds
Med/large: 2 seconds
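One way to read these numbers, sketched in Python: subtract the no-loop time from the loop time to get what the per-document loop itself adds, and set that next to the time for setting a single static property. The figures are the ones recorded in these notes. The interpretation is tentative, drawn from a handful of runs.

```python
# Batch seconds for 30k documents, as recorded in these notes.
groovy_loop = {"small": 7, "med/large": 18}       # loop, documents stored back out
groovy_no_loop = {"small": 2, "med/large": 2}     # only a process property set
set_one_property = {"small": 6, "med/large": 17}  # set properties, 1 static value

# Time attributable to looping over the documents and storing them back out.
loop_cost = {size: groovy_loop[size] - groovy_no_loop[size] for size in groovy_loop}

for size, cost in loop_cost.items():
    print(
        f"{size}: loop adds {cost} s; "
        f"setting 1 property took {set_one_property[size]} s"
    )
```

The loop overhead (5 s and 16 s) is within a second of the single-property timings (6 s and 17 s). That may mean most of the cost comes from touching every document in a shape rather than from the property assignment itself, but more runs would be needed before relying on that.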