"In many event processing systems […] events are immutable"1. This stems from the definition of what an event is: "An event is an occurrence within a particular system or domain; it is something that has happened, or is contemplated as having happened […]"2. So events cannot be made to unhappen.
Open Question: Does this apply to all systems, applications and use cases, or just to "many" as stated above?
I made immutability a general assumption in my work3. It is very useful for building systems (distributed systems, consistency, …).
Q: How can a stream processing agent process events if they are immutable?
A: Every processing task produces new, derived events as its results. Advantage: the original (base) events remain available for other uses and stay immutable.
For <abbr title="RDF Stream Processing">RSP</abbr> this means: (1) create a new (unique) graph for the derived event, and (2) possibly link back to the base event(s), thus enabling drill-down or root-cause / provenance analysis of the derived event. The links can be made with DUL:hasConstituent from DOLCE Ultralight4. In my own work5 I use a new :members property to link from a derived event to its simple events. The property is a subproperty of the mentioned DUL:hasConstituent.
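The two steps can be sketched in a few lines of Python. This is only an illustration: quads are modelled as plain `(subject, predicate, object, graph)` tuples, and the `EX` namespace, the `MEMBERS` URI and the function `derive_event` are hypothetical names that stand in for a real vocabulary and RDF library.

```python
import uuid

# Hypothetical namespace and property URIs, for illustration only.
EX = "http://example.org/events/"
MEMBERS = "http://example.org/vocab#members"  # assumed subproperty of DUL:hasConstituent
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def derive_event(base_event_graphs, event_type):
    """Return (graph_name, quads) for a new derived event.

    The base events are only referenced by name, never modified.
    """
    graph = EX + str(uuid.uuid4())  # (1) fresh, unique graph name
    quads = [(graph, RDF_TYPE, event_type, graph)]
    for base in base_event_graphs:
        # (2) link back to each base event for drill-down / provenance
        quads.append((graph, MEMBERS, base, graph))
    return graph, quads
```

Calling `derive_event([EX + "a", EX + "b"], EX + "Alarm")` yields one type quad and two link quads, all placed in the newly minted graph, while the graphs of the base events stay untouched.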
Observation: Receiving agents may want to add "received time" and other metadata later. Adding triples to the event graph afterwards, with the graph name as subject, can still be legal and can be considered as amending the event header. This is much like email: intermediate mail servers can add headers, but the mail body and ID are immutable.
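One possible way to picture the header/body split is the following sketch. The class `Event`, its fields and all URIs are hypothetical. The body is a `frozenset` and therefore cannot change, while the header only grows, and every header triple has the graph name as its subject.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    graph: str       # graph name, doubles as the event ID
    body: frozenset  # body triples, fixed at creation
    header: list = field(default_factory=list)  # append-only metadata

    def amend_header(self, predicate, value):
        """Append one header triple whose subject is the graph name."""
        self.header.append((self.graph, predicate, value))


event = Event(
    "http://example.org/events/e1",
    frozenset({("http://example.org/sensor1",
                "http://example.org/vocab#temp", "21.5")}),
)
# A receiving agent stamps the event without touching its body.
event.amend_header("http://example.org/vocab#receivedTime",
                   "2014-05-01T12:00:00Z")
```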
“A punctuation is a pattern p inserted into the data stream with the meaning that no data item i matching p will occur further on in the stream.”
For event processing systems, events are the fundamental unit of information3. This means each event is processed atomically, i.e. completely or not at all. For RDF stream processing systems this can cause problems if events are modelled as graphs consisting of multiple quadruples: how can the receiver of an event know that all quadruples pertaining to the event have been transmitted, so that it can start processing the event?
For streams of RDF graphs, punctuation can be used like this: a punctuation is a pattern “p” inserted into the quadruple stream with the meaning that no quadruple i from graph p will occur further on in the stream.
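A receiver could act on such a punctuation roughly as follows. This is a minimal sketch under stated assumptions: quads are `(subject, predicate, object, graph)` tuples, and the punctuation is an in-band marker quad whose predicate is the hypothetical URI `PUNCTUATION` and whose subject names the finished graph.

```python
PUNCTUATION = "urn:example:punctuation"  # hypothetical marker predicate


def complete_graphs(quad_stream):
    """Yield (graph_name, quads) as soon as a graph's punctuation arrives."""
    pending = {}  # graph name -> quads buffered so far
    for s, p, o, g in quad_stream:
        if p == PUNCTUATION:
            # The subject names the finished graph: no further quads will
            # arrive for it, so the event can be handed on as one unit.
            yield s, pending.pop(s, [])
        else:
            pending.setdefault(g, []).append((s, p, o, g))
```

Quads of different graphs may be interleaved in the stream; each graph is released only once its own punctuation has been seen, which is what makes atomic processing possible.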
Punctuation could be implemented using special ("magic") quadruples, but when using the Web stack(!) we can do punctuation out-of-band, i.e. implement punctuation on a lower layer of the stack. For example, we can communicate through “chunked transfer encoding” (Fielding et al. 1999, Section 3.6.1)4 from HTTP 1.1. Each chunk contains a complete graph, so once a chunk has been received the receiver knows that the event has arrived completely and can be processed further in an atomic fashion. There is a guarantee that no quads for this graph will arrive later. With HTTP chunked connections, no special (or magic) quads are needed.
“Chunked transfer encoding” is also used by the RDF publish/subscribe middleware Ztreamy5 to provide long-lived connections over pure HTTP with the goal of disseminating events to subscribers. Further related work6 investigates the exchange of RDF over different protocols such as XMPP on top of HTTP (and thus TCP), and even UDP. However, none of these protocols provides pure HTTP stream URIs, which are easily referenced in Linked Data.
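To make the framing concrete, here is a minimal sketch of a reader for an HTTP/1.1 chunked body that hands out one payload per chunk. It assumes direct access to the raw, still-encoded byte stream (for example a file-like object wrapped around the socket); many HTTP client libraries decode chunks transparently and do not expose chunk boundaries. The name `read_chunks` is hypothetical, and trailer fields after the last chunk are ignored.

```python
import io


def read_chunks(stream):
    """Yield the payload of each chunk of an HTTP/1.1 chunked body."""
    while True:
        size_line = stream.readline()
        if not size_line:
            return  # connection closed
        # The chunk size is hexadecimal; drop optional extensions after ';'.
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            return  # last-chunk marker: end of the stream
        payload = stream.read(size)
        stream.read(2)  # consume the CRLF that terminates the chunk data
        yield payload  # by convention here: one complete graph
```

If the sender puts exactly one serialized graph into each chunk, every yielded payload can be parsed and processed as one complete event, for instance by iterating over `read_chunks(io.BytesIO(raw_body))` in a test setup.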
Maier, D.; Li, J.; Tucker, P.; Tufte, K. & Papadimos, V. Semantics of Data Streams and Operators. Proceedings of the 10th International Conference on Database Theory, Springer-Verlag, 2005, 37-52 [http://datalab.cs.pdx.edu/niagaraST/icdt05.pdf]
Gupta, A. & Jain, R. Managing Event Information: Modeling, Retrieval, and Applications. Morgan & Claypool Publishers, 2011
Fisteus, J. A.; García, N. F.; Fernández, L. S. & Fuentes-Lorenzo, D. Ztreamy: A middleware for publishing semantic streams on the Web. Web Semantics: Science, Services and Agents on the World Wide Web 25, 2014, 16-23
Real-time operation has become one of the crucial characteristics of modern applications and is completely changing the game in data processing. Because it supports continual monitoring, real-time data processing has become a very important mechanism in many application areas: traffic management, logistics, eHealth and smart grids, to name but a few. In this talk we present technologies for dealing with real-time data on the fly, the challenges involved, and possible solutions to these challenges, such as using Web-friendly standards to create open and extensible systems for real-time data.