The world is complex. Therefore the structure (who, what, where) of many organizations is complex. Therefore the processes (how and in which order) in many organizations are complex. When managers think and deliberate about problems in organizations, they like to rely on diagrams. A good picture can summarize the complex structure and the processes of an organization in a clear way. All kinds of drawings that describe the structure or the processes of an organization are called models. I concentrate mainly on process models: the drawings that describe a series of actions that were carried out or that should be carried out. To be correct, I have to admit that a model isn't always a drawing. It is a representation of reality and therefore it could also be a piece of text. But mostly a model is a graphical representation.
As I already said, the world is complex. Therefore the structure of many organizations is complex. Therefore the processes in many organizations are complex. And therefore many process models are wrong. They are incomplete, not up to date or sometimes just incorrect. The reason is that models are mostly made by people, and people often make mistakes in complex situations (at least I do). Therefore a technique has been developed to let a computer program generate process models automatically. The aim is to produce process models faster and to make them better and more complete. This technique is called process mining. Many computer programs log information about their actions. This information can be useful to analyze afterwards where things went wrong. But it can also be used to reveal events and their order. We make a nice drawing of these events and end up with a process model.
The files where computer programs keep ('log') such information are called computer log files. They work just like the black box in a plane or the log book on a ship, where the captain occasionally writes down the condition of the ship and where every incident is also logged. In the same way, computer programs log what they do and what goes wrong. Usually each entry contains a timestamp, a username and a message.
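To make this concrete, here is a minimal sketch of how one log entry could be read into its three parts. The line layout (date, time, username, then the message) and the names LogEntry and parse_line are assumptions for illustration only; real programs each use their own format.

```python
from datetime import datetime
from typing import NamedTuple


class LogEntry(NamedTuple):
    timestamp: datetime
    username: str
    message: str


def parse_line(line: str) -> LogEntry:
    # Assumed layout: "<date> <time> <username> <message...>"
    date, time, username, message = line.strip().split(" ", 3)
    return LogEntry(datetime.fromisoformat(f"{date} {time}"), username, message)


# A made-up example line, not taken from a real log file.
entry = parse_line("2024-03-01 09:15:00 alice Sales offer sent")
print(entry.username, "->", entry.message)
```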
I would like to reveal a part of process mining. How does this information get converted into process models? You should realize that normally a process is executed multiple times. In the context of a ship there are multiple journeys, in a sales process there are multiple sales, in a job application process there are multiple applications, etcetera. In all log files we mostly find the same order of events multiple times (from embarkation to debarkation, from the sales offer to booking the invoice payment, from writing a job offer to the new employee's first day). In this way we discover a fixed order of events in a process and we can make a drawing with a square per step and edges between the squares. Note that we search for repeating orders of steps without needing to know what the steps mean.
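The idea of finding a repeating order can be sketched in a few lines. The sketch below counts how often one step is directly followed by another across several executions; each counted pair would become an edge between two squares in the drawing. This is a simplified illustration, not necessarily the technique used in my research, and the step names and the function name directly_follows are made up for the example.

```python
from collections import Counter


def directly_follows(traces):
    """Count how often step a is immediately followed by step b."""
    counts = Counter()
    for trace in traces:
        # Pair every step with the step that comes right after it.
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


# Three executions of a made-up sales process, one list per execution.
sales = [
    ["offer", "order", "invoice", "payment"],
    ["offer", "order", "invoice", "payment"],
    ["offer", "order", "invoice", "payment"],
]

edges = directly_follows(sales)
for (a, b), n in edges.items():
    print(f"{a} -> {b} (seen {n} times)")
```

Note that the code never looks at what "offer" or "payment" mean. It only compares labels and their order.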
Sometimes we discover a certain order of steps in one execution and a different order in another. But if each time one of these two possible ways is followed, we can add this to our drawing. We then draw multiple ways from start to end. Sometimes real exceptions are logged and we should consider whether we want to include them in our drawing. Sometimes there are errors in the logging itself. We should try to avoid these errors influencing our drawing. These are just some - but important - focus points of our research in process mining...
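One simple way to separate the usual ways through a process from rare exceptions or logging errors is to count how often each complete order of steps occurs and to keep only the ones seen often enough. The sketch below does this. The threshold of 20 percent, the example data and the name frequent_variants are assumptions chosen for illustration; deciding what really counts as an exception is much harder in practice.

```python
from collections import Counter


def frequent_variants(traces, min_share=0.2):
    """Keep only the orders of steps that make up at least min_share of all executions."""
    variants = Counter(tuple(trace) for trace in traces)
    total = sum(variants.values())
    return {v: n for v, n in variants.items() if n / total >= min_share}


# Made-up executions: two common ways, plus one rare execution that
# could be a real exception or an error in the logging.
traces = (
    [["offer", "order", "invoice", "payment"]] * 5
    + [["offer", "order", "payment", "invoice"]] * 4
    + [["offer", "payment"]]
)

kept = frequent_variants(traces)
for variant, n in kept.items():
    print(" -> ".join(variant), f"({n} executions)")
```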
Not everything in an organization is supported by the same computer program. The e-mail with a sales offer is not sent with the same program you use to book an invoice. The invoice is made with yet another program. Because the different steps in a process are mostly supported by different computer programs, you probably won't find all information in one log file. Of course there is a trend of combining as much as possible in the same program. Some software offers a wide range of possibilities (e.g. SAP). Yet there will always be steps that are recorded by other programs (e.g. mailing). Some steps will not even be logged by a program (e.g. a telephone call). To get the most out of our technique, in my PhD I want to try to combine the information of multiple log files (from multiple programs) into one more complete process model.
This is of course not that easy. There's no uniform way of making computer log files, and these log files aren't made with the purpose of process mining. Each computer program logs in its own way, with its own focus, with its own goals and as a consequence with different information. Merging this information from different sources is a hard job. Not only do the computer programs probably use different names for the same thing (e.g. user, person, owner, ...), logging can also happen on different levels of detail (e.g. one program logs an account being created, another program mentions the assignment of a user name, password and e-mail address). Information can also partially or fully overlap (e.g. accounting keeps records of the payment of an invoice, but you can also find this on bank account statements).
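The easiest part of this problem, different names for the same thing, can be sketched as follows. The two example logs, the field names and the mapping FIELD_ALIASES are all invented for illustration. The sketch only renames fields and sorts the combined entries by time; it does nothing about different levels of detail or overlapping information, which are the hard parts.

```python
from datetime import datetime

# Hypothetical mapping from each program's field names to one shared vocabulary.
FIELD_ALIASES = {
    "user": "actor", "person": "actor", "owner": "actor",
    "time": "timestamp", "when": "timestamp",
    "action": "event", "msg": "event",
}


def normalize(record):
    """Rename known fields to the shared vocabulary, leave other fields as they are."""
    return {FIELD_ALIASES.get(key, key): value for key, value in record.items()}


def merge_logs(*logs):
    """Combine several logs into one list, ordered by timestamp."""
    merged = [normalize(record) for log in logs for record in log]
    return sorted(merged, key=lambda r: datetime.fromisoformat(r["timestamp"]))


# Two made-up logs from two different programs.
mail_log = [
    {"time": "2024-03-01T09:15:00", "user": "alice", "action": "offer sent"},
]
accounting_log = [
    {"when": "2024-03-05T14:00:00", "owner": "bob", "msg": "invoice booked"},
    {"when": "2024-03-02T10:30:00", "person": "bob", "msg": "order received"},
]

merged = merge_logs(mail_log, accounting_log)
for record in merged:
    print(record["timestamp"], record["actor"], record["event"])
```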
As a researcher I of course always try to think one step further. Imagine I indeed succeed in combining log files: are there any other benefits? Of course there are. The idea is not necessarily limited to multiple log files from one organization. Collaborating organizations can put their data together to analyze the overall process. If an organization wants to take over another organization and has access to its data, it can bring the data of both together and try to discover the overall process or the consequences of changes. If an organization has multiple business units, it could merge their data to analyze deviations in the process. If the government could request all this data in a certain domain, it could model the whole domain and look at the consequences of a change in one organization for the domain (e.g. food industry, automobile industry, banking, ...).
I hope I have given you a clear picture of what I'm doing. If you are interested but didn't understand what I explained, please let me know. Not only will I try to help you understand, you will also help me by pointing out which part of the text is difficult. If you are not a researcher, please don't hesitate to ask either. Your input is perhaps the most important, because people who are not familiar with these things may ask the most relevant questions (e.g. 'all right, but is anyone waiting for this?'). You can find my contact info here.