Feed aggregator
Towards proactive enterprise intelligence by Gregoris Mentzas
I came across a recent presentation given by Gregoris Mentzas (from NTUA, Greece) entitled "Towards proactive enterprise intelligence". In this presentation Gregoris discusses some research challenges.
The capabilities of proactive enterprise intelligence are defined in slide 21 and seem similar to our definition (I also recognized the pictures). I'll write more about the two patterns expressed in this slide.
Reading this presentation is recommended. Enjoy!
Driving while looking at the rear-view mirror
Typically I don't recommend commercial blogs, but I'll make an exception this time and cite Mark Palmer's post, since I like the metaphor he made, reflected in this picture. According to Mark, analytics that refers to the past is like driving while looking only at the rear-view mirror, which of course can show only the road you have already passed. Since we typically drive forward and not backward, it should be more useful to look ahead than to look back. In many cases the road is fixed, in the sense that the road forward looks exactly like the road backward, and then looking back might make sense. In other cases, however, like driving in real traffic and not in a bubble, the road ahead may contain surprises that are not evident from the previous parts of the road. The interesting thing is that even low-latency event processing systems are, in fact, looking at the past, where the past is almost the present; looking to the future is not a standard event processing feature.
On the city data management workshop in CIKM'12
I am one of the co-chairs of the "City Data Management 2012" workshop in CIKM'12. Work on smart cities has emerged in the last few years. The data management aspect of smart cities is one of the major topics, due to the need to acquire significant amounts of data, much of it streaming data, and to perform monitoring, search, query and various analytics. The workshop focuses on several aspects of city data management. The call for papers is now out; the interesting part for me is the data monitoring part, which consists of the following points:
- Complex Event Processing for Smarter Cities
- Anomaly detection and prevention
- Forecasting city events
- Event-based optimization for adaptive city operations
- City process monitoring
If you are in this area -- consider submission. Note that the conference will take place in Maui, Hawaii.
On the right education for entrepreneurs
I came across an interesting discussion on the topic of what the right education for entrepreneurs is. According to the study cited in this article, most entrepreneurs don't have degrees in computer science or engineering but in various other disciplines. This echoes the debate between Bill Gates, who thinks that engineering education is the only one that matters, and Steve Jobs, who said that at Apple the DNA is a mix of engineering and liberal arts.
I have some perspective on this issue since I have enjoyed a diversified education. I did a BA degree in Philosophy, then went for MBA studies and learned the basics of business concepts, and then did a PhD in Computer Science. More than 10 years ago I participated in the "basic blue" management course of IBM. This was a "private course" given for new managers in IBM Haifa Research Lab: due to organizational changes and the addition of one more level of management, enough new managers were appointed within a short period of time to justify bringing the course to us. As preparation for the course, the participants fill in a substantial questionnaire, and there are also 360-view questionnaires from managers, peers and direct reports. When showing the results, the course's instructor said that he was surprised to see that 16 of the 18 participants had very similar personality profiles. The common denominator: all of them studied computer science or computer engineering at the Technion; I was one of the two exceptions. While generalizations never properly work, it seems that there is something in engineering education that drives people to concentrate on the technical details and see their challenge in doing strong technical work. There is nothing wrong with that; any company in the technology domain, small or big, requires strong technical people.
Entrepreneurs require something beyond technical skill: they need the passion to change the world, to believe that they are doing the "next big thing". This is another mode of thinking. I have never been an entrepreneur (perhaps it is still not too late), but I have written before about intrapreneuring (trying to do "start-ups" within a big corporation). If I reflect upon my past studies and see what contributed most to me, I view my studies as complementary. The MBA studies contributed an understanding of the basics of business concepts, the computer science studies contributed to the understanding of technology and the content of what I've been doing, but the passion to change the universe and the ability to think out of the box definitely came from the philosophy studies. In retrospect they were the most influential in shaping my personality.
I am not surprised that many entrepreneurs are not engineers; there is something inherent in engineering education that makes people think in boxes.
What’s Behind CEP’s Leaps and Bounds? Automation
More on the layered approach of the event-driven world
I have been asked by several people to write in more detail about what I meant by the layered approach, so I thought it would be better illustrated in an architecture diagram (converting it to a picture did not result in high quality, somehow). The rationale behind it is that event processing has become pervasive in use, with both event processing products and "build your own" solutions. In the same way that application servers based on standards like J2EE contributed to the building of certain types of web-based enterprise applications, an application server for event-based applications is required to provide services of various types: context service, adapters to sensor and mobile platforms, dashboard, management services, meta-data service and more. The second layer is the agent layer, which provides both a directory of agents and tools to build your own agent. The application is the third layer. This architecture can provide independence and the ability to get best of breed in both services and functionality components from different sources, and combinations of "build" and "buy". Standards are of course the key to making it happen.
Twitter Cannot Predict Elections Either
On robots for the elderly
Fujitsu Releases Software Supporting the Utilization of Big Data
A 100-Gigabit Highway for Science
The Event driven world - a layered approach
Sensors and social media are becoming the major sources of data, while traditional organizational data does not grow at the same rate. Many information systems, both at the enterprise level and at the individual level, are being based on streaming data; these are all old news. I have recently written about the event server as the 21st century application server, following TIBCO's statements. Progress Software, in its recent statement, also talked about cloud-based middleware that is geared towards real-time analytics, as part of its restructuring.
Currently, products and solutions in this area provide a mix of different things. I envision that we'll settle on a three-tier architecture, with different products along the food chain that might be more specialized:
The event application server layer:
This will provide services like:
- sensor and actuator framework, including adapters, data normalization, de-duplication and more.
- routing services -- pub/sub, channel management.
- state services -- access to external stores (databases, files), global state, local state, state machine services
- meta-data repository services -- representation of events, relationships among events, and other meta-data entities.
- event flow services -- ability to devise event flows ("event processing networks") from the "programming in the large" point of view (not the implementation of specific agents); this will provide an API for event producing, event routing and event consuming.
- tuning and scalability services -- ability to tune applications by using various ways of parallelism and distribution, fast routing, run-time scheduling and load balancing.
- tracking and provenance services
- context services
- management services
- visualization --- dashboard and visual analytics for feedback services.
- recoverability services
- high availability services
- security and privacy services
The application server layer will be based on an event-driven, agent-oriented architecture, but will also support request-driven access to various components. This will support a large quantity of lightweight agents, and the flow among them. While this is quite different from the way most people think about programming, it will become one of the pervasive programming models -- so it is a good investment for the future career of professionals to understand it.
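To make the routing services and the lightweight agents a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `EventChannel` class, the event fields (`type`, `sensor`, `value`) and the threshold of 30 are hypothetical, and no existing product exposes this API. It only shows the shape of the idea: agents subscribe to event types, and a derived event is published back into the same channel.

```python
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]
Handler = Callable[[Event], None]


class EventChannel:
    """Hypothetical in-memory pub/sub channel (illustration only)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = {}

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._subscribers.setdefault(event_type, []).append(handler)

    def publish(self, event: Event) -> None:
        # Route the event to every handler registered for its type.
        for handler in list(self._subscribers.get(event["type"], [])):
            handler(event)


channel = EventChannel()
alerts: List[Event] = []


def threshold_agent(event: Event) -> None:
    # Lightweight agent: derive an "alert" event from a high reading.
    if event["value"] > 30:
        channel.publish(
            {"type": "alert", "source": event["sensor"], "value": event["value"]}
        )


channel.subscribe("temperature", threshold_agent)
channel.subscribe("alert", alerts.append)  # a consumer that just collects alerts

channel.publish({"type": "temperature", "sensor": "s1", "value": 25})
channel.publish({"type": "temperature", "sensor": "s2", "value": 35})
```

A real event application server would add the other services listed above (state, recoverability, security and so on) around this core; the sketch deliberately leaves them out.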
The agent building environment:
The second layer is the layer that assists in constructing event processing agents. One way of using the event application server is to use the API and implement the logic within traditional programming languages; this will require standards like JDBC to access events and event operators in the same way that we currently issue SQL queries inside programming languages. The other way is to use agent building tools that generate code for agents; these might have the benefits of reusability, optimized implementation, and reduction in cost. This setting will allow hybrid applications: an application may consist of a filter agent written in Java using the assertion operator API, an aggregation agent taken from a vendor that specializes in optimized statistical functions, and a pattern matching agent taken from another vendor who specializes in some type of patterns (e.g. spatio-temporal patterns).
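The hybrid application described above can be sketched as a chain of three independent agents. This is a minimal illustration in Python, not the API of any vendor: the agent functions, the event field `value`, the window size of 2 and the "three rising values" pattern are all assumptions chosen to keep the example small. The point is only that each agent could come from a different source as long as they agree on the event representation.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List

Event = Dict[str, Any]


def filter_agent(events: Iterable[Event], keep: Callable[[Event], bool]) -> Iterator[Event]:
    """Pass through only the events that satisfy the assertion `keep`."""
    for event in events:
        if keep(event):
            yield event


def aggregation_agent(events: Iterable[Event], window: int) -> Iterator[Event]:
    """Emit the average of each consecutive group of `window` events."""
    batch: List[float] = []
    for event in events:
        batch.append(event["value"])
        if len(batch) == window:
            yield {"value": sum(batch) / window}
            batch = []


def rising_pattern_agent(events: Iterable[Event], length: int = 3) -> Iterator[Event]:
    """Emit a derived event when `length` consecutive values strictly rise."""
    run: List[Event] = []
    for event in events:
        if run and event["value"] > run[-1]["value"]:
            run.append(event)
        else:
            run = [event]
        if len(run) == length:
            yield {"pattern": "rising", "values": [e["value"] for e in run]}
            run = [event]  # start the next run from the last event


readings = [{"value": v} for v in [1, -5, 3, 5, 7, 9, 11, 13, 2, 4]]
valid = filter_agent(readings, lambda e: e["value"] >= 0)
averages = aggregation_agent(valid, window=2)
detected = list(rising_pattern_agent(averages))
```

Replacing any one of the three functions with an implementation from another source would not affect the other two, which is the independence the layered approach is after.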
The application layer:
Applications will be constructed on top of the first and second layers. An application may be developed entirely using a second layer supplied by a single vendor, by multiple vendors, home-made, or any combination. In some cases the applications will be developed by a specific user; in other cases they will be developed by independent software vendors specializing in a certain domain.
In this food chain we'll be able to see different vendors -- those who specialize in the application server layer and compete on the quality of services provided there; vendors who specialize in the agent building environment and provide filtering, transformation, aggregation and pattern matching, some of which may be specialized for a specific application type or domain; the third type of vendors will be application providers, which typically require subject matter expertise.
Some of the big vendors will attempt to be active in all three layers or create partnerships.
Back to the Holocaust day, my father and King Alfred
I typically don't publish the same post twice, but today, due to the Holocaust day, I am returning to something I published in this blog 3 years ago (and got many reactions to). The reason is that, according to the statistics, the reader population of this blog has grown significantly during these 3 years, so while old readers might remember it, I also want to share it with the newer ones. This day is very much associated in my mind with my late father, who was the only survivor of a big family. Here is a picture of my father late in his life:
I don't have any picture of him until after the war, but here is a picture of the kitchen in the Lodz Ghetto, where he worked at the beginning of the war (he said that he might be the person in the middle, but was not sure about it).
My father survived the war, while his parents and 7 brothers and sisters did not. He never talked much about this period in his life, saying that this was on another planet and cannot be described. Once he told me that immediately after the war a relative found him in the survivors list and sent him a letter asking him to describe whatever occurred to him during the war. My father's answer was:
I will not tell you about what happened to me in the war, instead I will tell you the story of King Alfred
King Alfred had escaped due to some revolt and was hiding on a farm. When the rebelling soldiers looking for him got there, the farmer hid the King below a big pile of straw; the soldiers started to search the straw, they removed nearly all of it, and then decided to move on. At a later time the King succeeded in overcoming the mutiny and returned to his throne. At some point the farmer came to visit him and the King said to him: you saved my life, I can give you whatever you wish. The farmer said: I am a modest person and don't need anything; I have one question for you: what did you feel when the soldiers had almost removed all the straw? The King showed an angry face and said: this is a very rude question, hang this man immediately. The farmer was about to be hanged, a rope was already tightened around his throat, and then the King said: stop, now you see what I felt.
On temporal extension to SQL:2011
I have written before about the recent return to bi-temporal databases, in conjunction with DB2. The first attempt to create bi-temporal extensions to SQL was in the 1990s; at that time there was a language war, some of which is reflected in the book that I co-edited, published in 1998. Now, after some attempts, SQL:2011 does include support for bi-temporal databases. The terminology was changed from the original terms: what was called "valid time" in the original version is called "application time" in the SQL version, and what was called "transaction time" in the original version is called "system time" in the SQL version.
More details about the SQL extension can be found in the overview presentation that Craig Baumunk uploaded to slideshare. As I have written before, temporal databases are vital for maintaining historical events, and thus the importance of this standard, and of the databases that support it, to event processing applications is noticeable.
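The difference between the two time dimensions is easier to see on a small example. The sketch below is plain Python and deliberately does not use SQL:2011 syntax; the `Version` record, the `lookup` function and the sample data (an employee whose department is corrected retroactively) are hypothetical and serve only to illustrate the semantics. Application time says when a fact holds in the modeled world; system time says when the database recorded it.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

FOREVER = date.max  # stands for an open-ended period


@dataclass(frozen=True)
class Version:
    key: str
    value: str
    app_from: date  # application time ("valid time"): start, inclusive
    app_to: date    # application time: end, exclusive
    sys_from: date  # system time ("transaction time"): start, inclusive
    sys_to: date    # system time: end, exclusive


def lookup(rows: List[Version], key: str, valid_on: date, known_on: date) -> Optional[str]:
    """Value that held on `valid_on`, according to what was recorded as of `known_on`."""
    for r in rows:
        if (r.key == key
                and r.app_from <= valid_on < r.app_to
                and r.sys_from <= known_on < r.sys_to):
            return r.value
    return None


rows = [
    # Recorded on Jan 10: e1 is in Sales from Jan 1 onward.
    # This version was superseded in the database on Mar 1.
    Version("e1", "Sales", date(2012, 1, 1), FOREVER, date(2012, 1, 10), date(2012, 3, 1)),
    # Recorded on Mar 1: e1 actually moved to Research on Feb 1.
    Version("e1", "Sales", date(2012, 1, 1), date(2012, 2, 1), date(2012, 3, 1), FOREVER),
    Version("e1", "Research", date(2012, 2, 1), FOREVER, date(2012, 3, 1), FOREVER),
]

before_fix = lookup(rows, "e1", valid_on=date(2012, 2, 15), known_on=date(2012, 2, 20))
after_fix = lookup(rows, "e1", valid_on=date(2012, 2, 15), known_on=date(2012, 3, 5))
```

Here `before_fix` is "Sales" and `after_fix` is "Research": the same question about February 15 gets a different answer depending on which state of the database is consulted, and both answers remain retrievable.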
On lack of monitoring
Recent financial news indicates that the Financial Industry Regulatory Authority has fined one of the analyst firms for failing to supervise equity research analyst communications with traders and clients and for failing to adequately monitor trading in advance of published research changes to detect and prevent possible information breaches by its research analysts. This is interesting, since it seems that the regulator now expects firms to monitor the consequences of their actions. This calls for real-time monitoring, monitoring causalities between various events, and eliminating increased trading based on unpublished information. Recently, I participated in a meeting with stock exchange people in a certain country, and heard about their efforts to detect undesired phenomena in trade. They also were looking at real-time monitoring, and even proactive behavior, trying to detect undesired phenomena before they happen. I guess that we'll see more of these applications from different sides: regulators, traders, and in this case analysts.
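As a rough illustration of what monitoring trading in advance of published research changes could look like as an event processing pattern, here is a minimal sketch in Python. It is not how any regulator or firm does it: the event types (`trade`, `research_change`), the fields (`id`, `symbol`, `time` in seconds) and the look-back window are all assumptions, and the pattern only finds temporal proximity, which is a candidate for review and not proof of causality.

```python
from collections import defaultdict, deque
from typing import Any, Deque, Dict, Iterable, List, Tuple

Event = Dict[str, Any]


def flag_trades_before_research(
    events: Iterable[Event], window: int
) -> List[Tuple[Event, Event]]:
    """Pair each research change with the trades in the same symbol that
    occurred at most `window` seconds before it.

    `events` is assumed to be ordered by the "time" field.
    """
    recent: Dict[str, Deque[Event]] = defaultdict(deque)
    flagged: List[Tuple[Event, Event]] = []
    for event in events:
        trades = recent[event["symbol"]]
        # Forget trades that fell out of the look-back window.
        while trades and trades[0]["time"] < event["time"] - window:
            trades.popleft()
        if event["type"] == "trade":
            trades.append(event)
        elif event["type"] == "research_change":
            flagged.extend((trade, event) for trade in trades)
    return flagged


stream = [
    {"type": "trade", "id": "A", "symbol": "ACME", "time": 0},
    {"type": "trade", "id": "B", "symbol": "ACME", "time": 500},
    {"type": "trade", "id": "C", "symbol": "OTHER", "time": 550},
    {"type": "research_change", "id": "R1", "symbol": "ACME", "time": 600},
]
flagged = flag_trades_before_research(stream, window=300)
flagged_ids = [trade["id"] for trade, _ in flagged]
```

With a 300-second window only trade B is flagged: trade A is too old and trade C is in a different symbol. A proactive version would have to work from events that precede the publication itself, such as a pending rating change, which this sketch does not attempt.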
ebizQ’s Business Agility Watch
On event server as the 21st century application server
Paul cites TIBCO CEO Vivek Ranadivé in TIBCO's quarterly earnings report, and concludes that an event server is a requirement in many applications that process events in various ways.
Getting to the notion of application server (see illustration below taken from an article on Websphere Application Server)
Application servers are intended to support services to applications such as: transaction, storage, database approach, security, high availability, administration and more.
In the event-driven world there are flowing events, and with the Internet of Things, most data in the universe will be in the form of events. In the event processing manifesto (Paul was one of the contributors to its creation) we talked about an "event fabric" which will enable Internet-scale sharing of events and will support many applications. Some of the fabric properties mentioned were providing services of privacy, security, interoperability among fabric instances, provenance, energy efficiency, autonomic computing support (self-tuning etc.), availability, scalability, anonymity, non-repudiation, and QoS with multiple criteria. These are some of the services, and there are of course functional services like context service, adapter and transformation service, filtering service, aggregation service, pattern matching service and more that should be built into the server and can be used by various applications from various application areas and types (BPM, CRM, social computing, track and trace and many more).
Paul rightly notes that standards have a key role in establishing such an event server; Paul indeed wrote the standards chapter in the manifesto.
I think that the equivalent of an app server based on events is inevitable, since events will be at the heart of all applications that take sensor data as input. Work on standards in this area is an old dream, and I hope that we'll be able to advance towards it.
On two-tier analytics
Will Cappel from Gartner has written about two-tier analytics and went back to Immanuel Kant (in the picture above) as support for his thesis. Kant argued that human cognition works on two levels: the first level grasps objects and raw facts about them, and the second level captures causality between these objects over space and time. Allowing for some simplification of what Kant said, he is right. Cappel makes the analogy to the analytics world, and says that the first level is satisfied by event processing, which processes events by filtering, transformation and pattern detection to identify higher level situations. The second level is satisfied by pattern discovery engines that work on top of the first level. This is an interesting observation, but I think that the picture is somewhat more complicated, as there are more tiers. Event processing detects patterns in real-time, and indeed one of the ways to obtain these patterns is a pattern discovery mechanism over historical data. This data may include the results of event processing systems, but should also include many other data items that describe the impact on the environment: since situation detection triggers actions, and actions impact the environment, pattern discovery needs feedback from the outcome along with feedback from the process itself. The interesting part comes when we add real-time adaptation to the picture; here, similarly to how cognition works, the causality relations may change on the fly. Consider traffic management systems: studies show that these systems are chaotic in nature and one cannot forecast patterns of behavior based on past experience with sufficient accuracy; forecasting is limited to about 15 minutes into the future in some cases, and the control policies for highways should constantly adapt. Here we need four-tier analytics:
- The first tier is the off-line tier, which changes the settings of the system based on historical learning.
- The second tier is the event processing tier, which observes and monitors.
- The third tier is the real-time forecasting tier, which adapts the causalities and makes the short-term forecasts.
- The fourth tier is the real-time decision making tier, which makes the best decision possible within the time frame allocated for the decision (which may not be the globally optimal solution).
Bottom line: I agree with Cappel about the multi-tier approach, while pointing out that reality is somewhat more complicated...
Building the Internet of Things – with Microsoft StreamInsight and the Microsoft .Net Micro Framework
Fresh from the press – The March 2012 issue of MSDN Magazine features an article about the Internet of Things. It discusses in depth how you can use StreamInsight to process all the data that is continuously produced in typical Internet of Things scenarios. It also gives you an end-to-end perspective on developing Internet of Things solutions in the .NET world, ranging from the .NET Micro Framework application running on the device, the communication between the devices and the server-side all the way to powerful cross-device streaming analytics implemented in StreamInsight LINQ.
You can find an online version of the article here. Happy reading!
Regards,
The StreamInsight Team
Computer Scientist Drives for Comprehensive Traffic Model
A High-Level StreamInsight 2.1 Preview
Hello Folks,
With StreamInsight 2.0 barely out the door, it may seem soon to start talking about the next version, but the team has been busy adding features and trying to keep to our usual 6-8 month release cycle. And with 2.1 shaping up to be a pretty significant release for us, we’d like to start giving a preview.
I’m not going to dig into a lot of technical details in this post – those will be forthcoming – but I would like to give a high-level overview of what we’ve done and why we’ve done it. Let’s start with motivation.
We’ve received a lot of feedback on StreamInsight 2.0 and earlier. To sketch some, we’ve heard that:
- The object model is somewhat hard to understand. E.g., what exactly is a query vs. a query template?
- It’s difficult to write the basic plumbing. And the state machine adapters need to adhere to makes them particularly hard to write.
- Although many of the aforementioned problems can be avoided by using sequence input (IObservables and IEnumerables) and output, they are restricted to the embedded host: as soon as you want to run remotely, you have to use adapters and the full object model.
- Using checkpointing likewise requires that you abandon sequence input.
- The query topologies supported by checkpointing are too limited. In particular, many users want to have a single query connect to a remote data source, ingest the data into StreamInsight, and present it to other queries. This is done through the use of published streams, the use of which precludes checkpointing.
In addition to this, the Reactive (Rx) community has been asking for a server like our remote host.
While any code written against StreamInsight 2.0 and earlier remains supported, StreamInsight 2.1 includes a rather large update to our programming surface that addresses all of these. We have:
- Created a new object model that is much more clear and consistent. We’ll talk about details in another post, but this object model is heavily influenced by Reactive’s sources, sinks, and subjects.
- Supported observable and enumerable workflows in the server. These can be combined with temporal logic or used independently. E.g., you can use Rx to marshal data into StreamInsight, or you can use the server to host solely-Rx pipelines – whichever best matches your workflow.
- Eliminated the need to use the complex adapter contracts in most cases. Instead of adapters, ingress and egress can be handled with observables and enumerables. Adapters remain fully supported in the new model, so you can continue to use them.
- Expanded the set of workloads that support checkpointing to include those with shared computation.
We’ll be sharing more details about the upcoming release over the next few weeks.
Cheers
-Isaac
