Saturday, December 29, 2007

Kudos to Streambase Support

Thanks to Steve Barber and some of the other members of the Streambase tech support team. I could not get an out-of-process, .NET-based adapter to communicate with the Streambase engine on my laptop, although it worked fine on my office machine.

The Streambase people worked during the slow holiday season to diagnose the problem. It was caused by a DLL called BMNET.DLL that was installed by my Cingular/ATT Communications Manager. I have a Cingular broadband card that I use to connect to the internet when I am on the road and don't have a free wireless connection to tap into. BMNET.DLL provides data acceleration for the Internet connection.

Microsoft references this problem here: http://support.microsoft.com/kb/910435



©2007 Marc Adler - All Rights Reserved

Aleri Evaluation

Just a small word about the Aleri evaluation. Several of you have repeatedly pinged me to find out what I thought of Aleri, so I am going to write down some of my impressions. The usual disclaimers apply: this is my opinion and does not necessarily represent the opinions of my employer.

My impressions were formed after a superficial evaluation of Aleri against some of the other vendors. I have not gotten to the point yet where I am testing the latency and throughput of the CEP engines. I have not soaked and stressed the CEP engines to see if any of them are leaking memory. I have not tried them on a variety of processors, and I have not examined their performance on multicore processors.

In a nutshell, my biggest area of concern with Aleri was the "spit-and-polish" of the product. They very well might have the fastest CEP engine out there. However, I was stymied by the quality of the documentation, and my perceptions of their Aleri Studio. It also seemed that they were more of a "system integrator" than some of the other CEP firms, taking separate products like OpenAdaptor and JPivot and trying to fit them into a holistic offering.

An example of this was the difficult time I had in getting Aleri integrated with SQL Server 2005 through OpenAdaptor. The documentation was non-obvious, and it took many hours with their sales engineer to finally get it connected. I compare this to Streambase and Coral8, where it took all of 5 minutes to hook up an SQL Server 2005 database to their systems (disclaimer: there is a problem getting Streambase, Vista and SQL Server to work together, although Streambase has not yet released an official version for Vista).

That being said, the salient points are:

1) Aleri's management team (Don DeLoach and Jeff Wootton) fully realize their shortcomings in the department of aesthetics, and have promised me that they are actively addressing them. I would highly recommend that Aleri look at Streambase, whose total package is good with regard to documentation and tutorials. (However, I still find a lot of pockets of geekiness in the Streambase documentation.)

2) The Aleri sales engineering team, led by Dave, got a variant of my initial use case to work. However, there are features that Aleri does not yet have, such as jumping windows and pattern matching, that make Coral8 and Streambase stand out.

3) Going through OpenAdaptor is not fun. Streambase and Coral8 make it simple to write adapters in C#. The Aleri sales engineer told me that he usually has to help clients get adapters to work. That is really not the message that a company wants to hear if it has many legacy systems to interface with.

4) Aleri has real-time OLAP, using JPivot. To my knowledge, they are the only CEP company to offer real-time OLAP. I did not really get to see this feature, but true real-time OLAP is something that a lot of financial companies are interested in. We want to be able to slice and dice our order flow in real time over different dimensions.

5) The Aleri Studio uses Eclipse, just like Streambase, and the icons even look exactly like Streambase's icons. However, the user interaction seemed a bit shaky at times, and there were moments when I got myself into "trouble" with the Aleri Studio by clicking on one thing before clicking on another thing. Again, Streambase seems more solid. And, Coral8 does not try to impose a GUI builder on the developer. The guys at Aleri told me that they are addressing these issues.

I was really pulling for Aleri, since the development center is about 15 minutes from my house in New Jersey. They are right down the block from the hallowed halls of Bell Labs in Mountainside, and some of the developers live in the next town over from me. You couldn't meet a company of nicer guys, and the CEO is a very low-key guy compared to other CEOs that I have met. I was impressed by the fact that, at the Gartner conference on CEP, he stood up in front of the audience and exhorted us to try different CEP products.

I anxiously look forward to the 3.0 version of Aleri's offerings, and to seeing tighter, easier integration between their various components, enhanced documentation, enhanced support for .NET, and a cleaner version of their Aleri Studio. Given the quality of the developers there, I am sure that this version will kick some butt.




Where are the CEP Customers?

It seems that, lately, every blog posting I make on CEP generates further blog postings from the vendors and the subject-matter experts in the CEP space. It's great to see the CEP blogs become more active, so that I can tap into the collective wisdom of people like Tim, Marco, Mark, etc.

However, where are the postings from other customers? Isn't there someone from Goldman, Lehman, Merrill, etc. who wonders about the same things that I do? Or do these people mainly purchase the pre-packaged algo trading packages that the vendors have to offer?

One thing that was very interesting was the copy of the Aite Group report on Complex Event Processing that somebody had forwarded to me. It seems that most of the CEP companies number their customers in the dozens, rather than the hundreds or thousands. It seems that we are either at the very beginning of the explosive growth that Aite predicts, or not many companies are finding a use for CEP, relying instead on their legacy apps to deal with streaming events. Certainly, in my own company, we have hand-written code for the various algo and HFT trading systems, code that I am sure the developers have profiled the hell out of in order to get the max performance. We would be hard pressed to replace this code with a generic CEP system.

If you are a customer or potential customer of CEP, I offer the opportunity to send me private comments at magmasystems at yahoo.






Wednesday, December 26, 2007

Visualizations Update

Stephen Few is rapidly positioning himself as the guru of business visualizations. His name has been brought to my attention several times over the past few weeks as someone to pay attention to .... "a new Edward Tufte", if you will.

Few has an online library with a lot of free articles to read. Right now, I'm reading Multivariate Analysis using Heatmaps. This is especially worthwhile reading following last week's visit by Richard and Markus of Panopticon, who showed more reasons why we should graduate from the free Microsoft heatmap control to the more feature-laden, doubleplusunfree, Panopticon product. As Panopticon adds more features in the value chain, it will be increasingly difficult to justify using a free product.

------------------------------------

Which brings me to another point that I have been thinking of ... a point that I raised on my previous blog posting. In the field of Enterprise Software, where do the responsibilities of a vendor begin and where do they end?

Take Panopticon, for instance. You can bind a streaming "dataset" to Panopticon, and Panopticon will render a realtime updating Heatmap to visualize that dataset. Of course, you ask how you get data into Panopticon, and you come back with the concept of input adapters.

Then, gradually, you wonder if their input adapters cover KDB, Wombat, Reuters, Vhayu, OpenTick, generic JMS, sockets, etc.

Then you wonder if Panopticon has input adapters that take the output of CEP engines, like Coral8 and Streambase. Or, you have a crazy thought like Panopticon embedding a copy of Esper/NEsper inside of itself.

Then, you get really greedy and wonder if Panopticon provides built-in FIX adapters that will devour a FIX 4.4 stream of orders and executions and show you what exchanges are slow today.

Then you wonder what kinds of analytical tools Panopticon might interface with ... since Panopticon is doing parsing and analysis of the streaming data anyway, can't it just take an extra step and analyze the silly data?

But, then if you are demanding all of these things of Panopticon and Coral8, how do you hook them together? Does the dog wag the tail or does the tail wag the dog?

Or, do we just consider Panopticon a simple visualization tool, demanding nothing more of it than the ability to display brightly colored rectangles of streaming data, and likewise, do we ask nothing more of Coral8 than to do what it does best ... recognize patterns and perform filtering and aggregations?

As Dali-esque as these thoughts may appear, these are the kinds of things that I need to consider. In my quest for an ecosystem around the CEP engine, do we ask the CEP engine vendors to expand outwards, or do we take the outer layer of components (i.e., the visualization and analysis tools) and ask them to expand inwards to meet the CEP engine? Whatever it is, my wish would be for a true plug-and-play architecture between the CEP engine, its input components, and its output components.




CEP Vendors and the Ecosystem

While my wife and son cavort around Australia and New Zealand for the next few weeks (I get to stay home and watch my daughter, who only has one week off from high school), I hope to be able to catch up on some of the blog posts that I owe people.

One of the things that is most important for me in choosing a CEP vendor is the ecosystem that surrounds the CEP engine. In a company such as mine, we need to interface with many different legacy systems. These legacy systems can hold crucial data, such as historical orders, customer trades, market data, volatility curves, customer and security reference data, etc. This data may reside statically in a database, be published out as flow over some kind of middleware, or be exposed through an object cache or data fabric. We have every color and shape of database technology in our firm, whether it be more traditional relational databases like Oracle, SQL Server, and Sybase, or newer tick databases like KDB+.

From the input and output points of the CEP engine, we need seamless integration with all sorts of systems. Most CEP engines have the concept of in-process and out-of-process adapters. In-process adapters are more performant than out-of-process adapters. We would love to see as many in-process adapters as possible delivered out-of-the-box by our CEP vendor. We do not want to spend time writing our own in-process adapters.

So far, none of the CEP vendors support KDB+ as an out-of-the-box solution. In fact, many of the CEP vendors did not even know what KDB+ was. (Is the same true for Vhayu as well?) My feeling is that, if a CEP vendor is going to be successful on Wall Street, then they must support KDB+. Is it even feasible for the CEP vendors to provide an abstraction layer around KDB+, and let the CEP developer write all queries in SQL instead of writing them in K or Q?

One of the most important things that I would like to see from the CEP vendors is a set of tools to enable the analysis of all of the data that pass through the CEP engine. Many groups might not have the budget to hire a specialized mathematician or quant to perform time-series analysis on the data. Learning specialized languages like R or SPlus might not be possible for smaller groups that do not have a mathematical bent. The same goes for packages like Mathematica and Matlab.

Would it be worth it for the CEP vendors to come out with a pre-packaged "stack" for various financial verticals that incorporates analysis tools? Or, would writing a detailed cookbook be better? And, where does the responsibility of the CEP vendor end? Should we expect the CEP vendor to provide a one-stop shop for all of our needs, or should we just expect the CEP vendors to provide strong integration points?

Better yet, does this open up an opportunity for a third party company to provide this service? Like the many laptop vendors who buy a motherboard (the CEP engine), and slap together a disk drive, CD drive, screen and keyboard to make a complete system?

In examining the various CEP vendors, I have come to the conclusion that the offerings from Streambase, Coral8 and Aleri are very similar. Given another year, I might expect each vendor to fill in the gaps with regard to their competitors' offerings, and at that point, we might have practically identical technologies from 3 different vendors. In my opinion, the real win for these CEP vendors will come in the analysis tools they provide.



Saturday, December 22, 2007

At Eaton Airport, October 2007

[Photo: Plane Ride to Hamilton, Oct 2007]

My shirt is hanging out of my fleece, but nevertheless, the foliage was beautiful at Eaton Airport in Norwich, New York. My Mooney M20G is right behind me.


Wednesday, December 19, 2007

Blog Momentum

On December 18th, I looked at my ClustrMap and saw that I had 239 visitors on the previous day. While this is nothing compared to the big blogs out there, it is definitely an improvement from the 30-40 per day that I had last year. I am not sure if my involvement with CEP has anything to do with it, or if the blog is becoming better known across the Wall Street community (I believe it's the latter).

The level of comments has increased too. The number of private emails that I get has also increased, as a lot of you share your opinions about working on Wall Street, evals of CEP vendors, etc.

Some of you wonder if I can get into trouble with my company for the blog. Let me tell you that there are a HUGE number of people from my company who read my blog, not only from the tech side, but from the business side as well. The former CTO of our company was a big blog reader of mine. The CIO of our company has his own internal blog, and I know that people have recommended my blog to him.

Take a look at what other big financial companies are doing with blogs and wikis. Enlightened management at financial institutions will embrace social networking and a more open community, not run scared at the thought of what a competitor might pick up from a blog. Wall Street and The City are very close-knit communities, and more information passes between the employees at Lehman and Goldman during Happy Hour than whatever can be revealed from a blog.



Tuesday, December 18, 2007

Bonus Season

Word is trickling across the Street about Goldman's bonuses. Everyone knows the headline number: an average bonus of over $600,000 per employee. But the whispers that I have heard from a number of people are that the IT employees below Managing Director level did not fare very well.

In conversations that I have had with a number of current and ex colleagues, we are speculating whether most Wall Street companies will turn off the bonus spout, and admonish employees to "Just try to go anywhere else! Nobody is hiring now!".

The likely scenario is a 10% reduction across the board of the lowest performers. I am going long in the steak-knife manufacturers.

What you don't want to see is the de-motivation of the normal, punch-the-clock worker in the IT field. These are the people who score in the middle of the road on their yearly evaluations, the people who have been at a company long enough to know one system inside and out, the people who don't blog nor read blogs, the people who don't work on weekends and who don't think about the next great product that they want to write for the traders. These are the people that usually hold the keys to the systems and the data, the people who you need to convince to open the gates to your applications. Giving these people the doughnut bonus will slow down processes, resulting in further roadblocks for the people who do want to get things done.

Nevertheless, I still have open headcount for the CEP project and for the .NET client framework team.


Farewell Kaskad

A player in the CEP space, Kaskad, is no more. Colin Clark writes in to say that, since the Boston Stock Exchange (BSX) has ceased to function, Colin had to disband Kaskad. I assume that since Kaskad did the work for the BSX as consultants, the BSX maintained all or most of the IP rights. Or, perhaps, Kaskad felt that since their only client was no longer funding their development, they could not gain further VC money in this financial environment.

According to the Aite report on CEP vendors, Kaskad had 16 employees. Since they are based up in Boston, I wonder if Streambase is looking at adding some additional talent.

Colin is looking for new opportunities in the CEP space, so contact him if you have anything that might interest him. Colin was involved in the old NEON (New Era for Networks) back in the dotcom boom, so he has the entrepreneurial streak running through him.


Friday, December 14, 2007

Streambase (yet again ...)

After vowing to bypass Streambase in my CEP engine evaluation, I may be forced to eat crow. I agreed to let Streambase into the evaluation process because I need to have two CEP engines in my project ... one as primary and one as "cold backup". And, for various reasons, Aleri and Esper did not pan out for me.

The new CEO of Streambase, Chris Ridley, came down to New York to meet with me, with his chief architect, Richard Tibbetts, in tow. They acknowledged some of the errors of their over-aggressive marketing, and told me about their sharpened focus on the financial industry marketplace.

They also let me have an eval version of Streambase that is not constrained by any license key, and in the interests of expediency, they graciously allowed me to bypass their eval agreement (which would have taken weeks to make it through my company's legal processes at this time of the year).

I installed Streambase on my laptop. My first impressions are ..... "slick". In other words, all the superficial, glossy stuff that gives the initial impression to a prospective customer is all there. Nice documentation with plenty of graphics, a great interactive tutorial, etc. I was "warned" that Streambase puts a lot of time into their studio and help system, and I can definitely concur. Nice job, guys.

I am going through the tutorials now. Several things jump out at me right away:

1) They use Eclipse as the foundation of their Streambase Studio. I am quickly becoming a fan of Eclipse, especially the way that you can automatically update Eclipse plugins.

2) The development methodology is more "file-based" than the other products. A familiar paradigm to Java/C# developers.

3) There are two ways to develop apps. The Event Flow method uses a GUI-based method. You can also program in StreamSQL. Unfortunately, there is no tie-in between the Event Flow and the StreamSQL files. In other words, unlike Coral8, if you make a change in the Event Flow, it does not get reflected in the StreamSQL file. In your project, you can have multiple Event Flow files and multiple StreamSQL files. I would love to be able to develop in either system, and have them automatically translated to the other system.

4) There are certain things that you need to do in the Event Flow system that you cannot do in StreamSQL. There are comments in their demo programs to this effect. I would welcome a document that outlined these differences.

5) I noticed that the icons used in the tool palette are identical to the ones that Aleri uses. Interesting. Someone looked at the other company's product.

6) Richard Tibbetts and Mark Tzimelson are very respectful of each other's work. Nice to see that kind of respect at the technical level.



Tuesday, December 11, 2007

Acropolis Shrugged

http://blogs.msdn.com/gblock/archive/2007/12/07/if-acropolis-is-no-more-what-s-our-commitment.aspx

CAB? Acropolis? CAB? Acropolis? CAB? Acropolis? CAB? Acropolis? CAB? Acropolis?

It looks like Acropolis may be going by the wayside, and the Patterns and Practices group has decided that, for today, they will refocus on CAB.

This is precisely why we build our own .NET frameworks in my investment bank: we need a high-quality framework that has some domain knowledge of what capital markets need. Goldman, Wachovia and Morgan Stanley have done the same.

And, this time last year, Microsoft came in and tried to get us to adopt CAB, and the week after that, they told us that Acropolis was the new flavor of the day. Thank the lord that we were focused on a mission to build what we wanted and needed, without all of the background noise from Microsoft. (Sorry Joe...)






Get Rich Quick with KDB+

I am convinced that the world needs more KDB+ consultants. The supply of these creatures is so small, that if you end up needing one in a hurry, you probably have to go through First Derivatives.

KDB+ is used by most of the Wall Street companies --- I can name Citigroup, Barclays, Bank of America, and Lehman as big KDB+ users --- to store tick and order data. KDB+'s biggest competitor is probably Vhayu.

The main barrier to learning KDB+ is its programming languages - K and Q - which can make APL look verbose!

If you are affected by the upcoming layoffs on Wall Street, and if you are looking for a new, exciting career change, and you don't relish the idea of selling steak knives door-to-door, then there is room in this world to be a KDB+ consultant.



Sunday, December 09, 2007

OpenAdaptor, SQL Server 2005, and Aleri

This posting was made in the interests of any Aleri or OpenAdaptor users who are trying to connect to a named SS2005 instance.

I have multiple "instances" installed of SQL Server 2005. According to the documentation at http://msdn2.microsoft.com/en-us/library/ms378428.aspx, this JDBC connection string should have worked:

jdbc:sqlserver://MAGMALAPTOP\RPT;databaseName=MyDatabase;integratedSecurity=true;

In Microsoft's JDBC Driver 1.2 for SQL Server 2005, there is a sample Java app called connectURL. With the connection string above, this sample app worked fine, and was able to connect to the RPT instance of my SS2005 database.

However, I could not get OpenAdaptor to work with this connection string. In case you are wondering why I was messing around with OpenAdaptor, it is because this is what Aleri uses for its adapters to external data sources.

After spending several hours this weekend trying to get Aleri to connect to SQL Server using the connection string above, I finally stumbled upon an alternative syntax for the connection string.

The new connection string is:

jdbc:sqlserver://MAGMALAPTOP;instanceName=RPT;databaseName=MyDatabase;integratedSecurity=true;

Notice that the instanceName is specified with a separate parameter.

So, there may be an issue with OpenAdaptor. Or, another theory that I have is that the backslash character in the connection string is being considered as an escape character.
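The escape-character theory can be illustrated with a minimal Python sketch. It assumes, without confirmation, that the connection string passes through a loader that follows Java .properties-style escape rules, where a backslash in front of an unrecognized character is silently dropped. The functions `properties_style_unescape` and `named_instance_url` are hypothetical helpers written for this illustration; they are not part of OpenAdaptor, Aleri, or the Microsoft JDBC driver.

```python
def properties_style_unescape(raw: str) -> str:
    """Simplified imitation of Java .properties escape handling.

    Assumption: the config loader treats backslash as an escape character.
    Unicode escapes and line continuations are deliberately not handled.
    """
    known = {"t": "\t", "n": "\n", "r": "\r", "f": "\f", "\\": "\\"}
    out = []
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch == "\\" and i + 1 < len(raw):
            nxt = raw[i + 1]
            # Unknown escape such as \R: the backslash is dropped, the letter kept.
            out.append(known.get(nxt, nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def named_instance_url(host: str, instance: str, database: str) -> str:
    """Build the backslash-free form that uses the instanceName property."""
    return (
        f"jdbc:sqlserver://{host};instanceName={instance};"
        f"databaseName={database};integratedSecurity=true;"
    )


original = r"jdbc:sqlserver://MAGMALAPTOP\RPT;databaseName=MyDatabase;integratedSecurity=true;"
mangled = properties_style_unescape(original)
safe = named_instance_url("MAGMALAPTOP", "RPT", "MyDatabase")
```

Under that assumption, the server name would reach the driver as MAGMALAPTOPRPT, which would explain the failure, while the instanceName form has no backslash to lose.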



Saturday, December 08, 2007

Getting intermediate results in Streams

Let's say that we want to keep a running total of the number of shares that we have traded, and at 4:00 PM every day, we want to dump out the total. In Coral8, we can do something like this:

CREATE LOCAL STREAM Totals (TotalShares INTEGER);

INSERT INTO Totals
SELECT SUM(shares)
FROM TradeInputStream KEEP EVERY 1 DAY OFFSET BY 16 HOURS
OUTPUT EVERY 1 DAY OFFSET BY 16 HOURS;

This looks pretty straightforward. The Totals stream retains the totals until 4:00PM. At 4:00 every day, it outputs the total shares to any other stream that is "subscribed" to Totals, and then resets itself to start accumulating new totals.

This is something that CEP engines are good at, whether it be Coral8, Aleri, or Esper.
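To make the window semantics concrete, here is a minimal Python sketch of a daily total that resets at a 4:00 PM cutoff. It is not Coral8 code: the class `DailyShareTotal` and its methods are hypothetical, and it closes a window only when the next trade arrives, whereas a real engine would presumably fire on a timer.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


class DailyShareTotal:
    """Accumulates shares and emits the total when the daily cutoff passes."""

    def __init__(self, cutoff_hour: int = 16) -> None:
        self.cutoff_hour = cutoff_hour
        self.total = 0
        self.window_end: Optional[datetime] = None
        self.emitted: List[Tuple[datetime, int]] = []  # (cutoff time, total)

    def _next_cutoff(self, ts: datetime) -> datetime:
        """Return the first cutoff strictly after ts."""
        cutoff = ts.replace(hour=self.cutoff_hour, minute=0, second=0, microsecond=0)
        return cutoff if ts < cutoff else cutoff + timedelta(days=1)

    def on_trade(self, ts: datetime, shares: int) -> None:
        if self.window_end is None:
            self.window_end = self._next_cutoff(ts)
        if ts >= self.window_end:
            # The window closed: publish its total, then start a fresh window.
            self.emitted.append((self.window_end, self.total))
            self.total = 0
            self.window_end = self._next_cutoff(ts)
        self.total += shares


totals = DailyShareTotal()
totals.on_trade(datetime(2007, 12, 7, 10, 0), 100)
totals.on_trade(datetime(2007, 12, 7, 15, 59), 200)
totals.on_trade(datetime(2007, 12, 7, 16, 0), 50)   # lands in the next window
totals.on_trade(datetime(2007, 12, 8, 9, 30), 25)
```

After these four trades, `totals.emitted` holds a single entry for the December 7 cutoff, and `totals.total` holds the shares accumulated so far in the following window.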

Now, let's enhance this a little bit.

Let's say we give the traders a .NET GUI application, and on this GUI is a "Status" button. The traders can press this button any time they want to know how many shares have been traded so far that day. So, at 2:00, a trader pushes a button on the GUI and we need to return to him the number of orders seen so far that day, the number of shares seen, the notional value of all orders, etc.

So, there are two questions:

1) How can we "dump out" these accumulators on demand? In other words, is there a way to tell these CEP engines to give me the contents of an aggregation stream AS OF THIS MOMENT?

2) How can we "call into" our CEP engine to retrieve these values? Do the CEP engines support an API that I can use from within the GUI to say "Give me the current value of a certain variable in my module"? Something like

IntegerFieldValue field = Coral8Service.GetObject("ccl://localhost:6789/Default/SectorFlowAnalyzer", "sum(Shares)") as IntegerFieldValue;
int shares = field.Value;

In a standard C# application, this would be as simple as putting a Getter on a variable, and just calling the getter. If I were using Web Services, then I could call into a Web Service and just ask for the values of some variables or for some sort of object. But, from a C# app, how can I get the current value of a stream that is aggregating totals?

Another way of accumulating the total number of shares in a CEP engine is to step into the procedural world, and just define a variable. In Coral8, it would be something like this:

CREATE VARIABLE TotalShares INTEGER = 0;

ON TradeInputStream
SET TotalShares = TotalShares + TradeInputStream.shares;

Then, we would need a "pulse" to fire at 4:00PM every day, and upon this pulse firing, we could send the TotalShares to another stream.

I am sure that there are patterns in every CEP engine for accessing intermediate results, but something that is a no-brainer in a procedural language may not be so easy in a CEP vendor variant of SQL.
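The variable-plus-pulse approach, together with the on-demand "Status" query from the GUI, can be sketched in plain Python. This illustrates the pattern only, under the assumption that the accumulator lives in the same process as the caller; `RunningTotals`, `snapshot`, and `pulse` are hypothetical names, not part of any CEP vendor's API.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    orders: int
    shares: int
    notional: float


class RunningTotals:
    """Procedural-style accumulators guarded by a lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._orders = 0
        self._shares = 0
        self._notional = 0.0

    def on_trade(self, shares: int, price: float) -> None:
        with self._lock:
            self._orders += 1
            self._shares += shares
            self._notional += shares * price

    def snapshot(self) -> Snapshot:
        """On-demand read (the "Status" button): totals as of this moment."""
        with self._lock:
            return Snapshot(self._orders, self._shares, self._notional)

    def pulse(self) -> Snapshot:
        """End-of-day pulse: return the totals and reset the accumulators."""
        with self._lock:
            final = Snapshot(self._orders, self._shares, self._notional)
            self._orders, self._shares, self._notional = 0, 0, 0.0
            return final


running = RunningTotals()
running.on_trade(100, 10.0)
running.on_trade(200, 20.0)
status_at_2pm = running.snapshot()   # does not disturb the running totals
end_of_day = running.pulse()         # publishes and resets
after_reset = running.snapshot()
```

The lock is there because the GUI thread and the trade feed would read and write the same counters. A snapshot is returned as an immutable value so the caller never observes a half-updated set of totals.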




Friday, December 07, 2007

Coral8 and Transparency

I just tried to get some info on the Apama Event Processing solution (not their Algo Trading platform, just the simple ESP platform). I filled out a form, and now I have to wait for a Progress sales rep to call to arrange a demo. Even if I want to see an Apama webcast, I need to fill out a form.

Let's contrast this with what Coral8 has to offer. Coral8 lets you download the entire developer platform, with all of the documentation included. Everything is included .... there are no important packages that are missing with the eval version. There is no 30-day license key that you have to get. There is no waiting for a salesperson to get in touch. As far as I know, everything is ready to go from the time you download the package.

I fail to understand why certain vendors make it so difficult to evaluate a package. In a big financial institution like the one I work for, if you use software in production and this software is not properly licensed and paid for, it is grounds for termination of your job.

Coral8 has the right attitude. Just get it into the hands of as many people as possible and spread the word around.


NEsper Docs

Good solid docs on the Java version. Unfortunately, the NEsper version references the Java docs. The NEsper version only comes with an auto-generated CHM file.

Looks like I will have to compile the examples and dig through the source code of the examples in order to see how to use NEsper. Thomas, it may make sense to outline the difference between Esper and NEsper in the master documentation ... maybe using highlighted text boxes to outline the differences.

It also may make sense to include the PDFs in the NEsper distribution.



Per Se

One of the perks of working for Mega-Bank is that you get invited to sales and marketing functions that were out of reach to me when I was a consultant. Such was the case this past Wednesday, when I was invited to a marketing function by Autonomy at Per Se.

If you have never heard of Per Se, then maybe you have heard of The French Laundry, a Napa Valley eatery that is run by renowned chef Thomas Keller. Per Se is Keller's New York City version of the French Laundry, and is one of the most difficult reservations to get in New York.

Autonomy has quarterly executive briefings at Per Se, where they bring together the sales and management of Autonomy, various Autonomy business partners, current customers, and future prospects. In addition to a fantastic lunch, we got to see how Standard & Poor's used Autonomy to help their analysts get through the millions of pages of regulatory filings. S&P has a team of PhD mathematicians that have developed some fairly sophisticated models in Autonomy to help them extract the "meat" out of their stream of documents.

Autonomy seems to be positioning themselves as a major add-on in the Sharepoint marketplace, adding very sophisticated document searching. It would be interesting to compare Autonomy with things like Google Search and X1.


On to Esper/NEsper

I have had to expand the CEP evaluation process. I am going to start looking at NEsper, and maybe, Apama (after some recommendations by some of our Asia folks who seemed pleased with Apama's performance under heavy loads).

I just downloaded NEsper and I am starting to go over some of the docs. I am sure that Thomas and Aaron will correct me if I say anything incorrect about Esper. Two things stand out about the Esper offering:

1) No GUI Builder or visual tools, probably because .....

2) Esper/NEsper is a component designed to be incorporated into your application ... in other words, it is treated as a third-party .NET assembly, just like things like Syncfusion, Log4Net, etc.

So, unlike Coral8, where you run a separate Coral8 server process, you need to write an application that "contains" Esper/NEsper. While this solution does not favor quick, out-of-the-box prototyping as Coral8 does, it gives you more control over the CEP actions. Everything with Esper/NEsper is "in-process" to your application.
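The embedded model can be sketched as follows. This is a toy in-process engine in Python, meant only to show the shape of the hosting pattern; `EmbeddedEngine`, `add_statement`, and `send_event` are hypothetical names and do not reflect the actual Esper/NEsper API.

```python
from typing import Any, Callable, Dict, List, Tuple

Event = Dict[str, Any]
Predicate = Callable[[Event], bool]
Listener = Callable[[Event], None]


class EmbeddedEngine:
    """Toy engine that lives inside the host application's process."""

    def __init__(self) -> None:
        self._statements: List[Tuple[Predicate, Listener]] = []

    def add_statement(self, predicate: Predicate, listener: Listener) -> None:
        """Register a filter plus the callback to run when it matches."""
        self._statements.append((predicate, listener))

    def send_event(self, event: Event) -> None:
        """Evaluate every statement synchronously, in the caller's thread."""
        for predicate, listener in self._statements:
            if predicate(event):
                listener(event)


# The host application owns the engine object; there is no separate server.
alerts: List[Event] = []
engine = EmbeddedEngine()
engine.add_statement(lambda e: e["shares"] >= 10_000, alerts.append)

engine.send_event({"symbol": "IBM", "shares": 500})
engine.send_event({"symbol": "MSFT", "shares": 25_000})
```

Because the listener is an ordinary callback in the same process, the application can read or react to results directly, at the cost of having to write and run the hosting application itself.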

Also, Esper/NEsper is Open Source. I downloaded it, and I have the full source code sitting on my hard drive. I have to talk to the financial guys at my company, but I don't think that we would have to do the amount of financial due diligence with an Open Source effort as we would with a company who does not follow the Open Source model. Maybe we will have to count the number of moths that fly out of Thomas' wallet.



Sunday, December 02, 2007

Reducing Lock Contention

Here is a good blog posting on 10 Ways to Reduce Lock Contention. Even though our .NET-based client-side framework behaves well, I should take these hints and examine our framework with a fine-tooth comb.

I am also following all of the recent developments in Parallel Programming coming out of Microsoft. I wonder how much Joe Duffy's team interacted with the Accelerator guys from Microsoft Labs. I am also very interested to see if our Derivatives Analytics team, which is very strong in .NET, can leverage this new technology instead of/in addition to some of the proprietary technology offered by hardware acceleration vendors.

By the way ... I just started using Google reader. This blog is one of the blogs that Google Reader automatically recommended.


Some Random Comments

1) Go out and see the new Coen Brothers' film, No Country for Old Men. Absolutely startling. I saw it over a week ago, and I still can't stop thinking about it.

2) Please pay careful attention to the Comments section of each post here. Many of the CEP vendors are responding with interesting and important comments.

3) I need to start thinking about a notification framework for the CEP project. Notifications are a bit more complicated in financial firms where Chinese Walls have to exist, and where compliance officers are constantly over your shoulder. It's a good idea to involve the compliance guys right from the start, and to have THEM tell you where the Chinese Walls should be. Nothing will get their attention more than discovering that your prop traders are getting notifications about your customer order flow!

4) I just subscribed to Alex's blog. Some interesting posts. Takeaways from skimming through his blog include:

a) I need to narrow down CPU pricing from the CEP vendors. Let's assume quad-core machines.

b) I am more curious about NEsper than I was before.

c) How will Esper and BEA stay in sync? Seems a bit annoying that BEA has seen fit to change some of the syntactic sugar of the original product.



Friday, November 30, 2007

Consultant Wanted - C# and Market Data Expertise

A vendor who we do business with (not a CEP vendor!) asked me to help them find an independent consultant for a few months. This consultant should be a C#/.NET expert, know market data APIs, and probably (I am guessing) be decent at communications. This consultant will be working with developers from the vendor, all of whom know C# at a basic level.

I will help them identify this consultant. The resulting work will be consumed by my team, as well as other financial firms that rely on real-time market data.

You can contact me at

XXXXX magmasystems XXXXXXX
XXXXXXXX at XXXXXXXXXXXXXX
XXXXXXXXXX yahoo XXXXXXXXX


Coral8 Update

We are finishing up the first phase of the Coral8 evaluation. This week, I met with Terry Cunningham (who flew his Falcon 10 out to meet us), and I had a great session with Henry, their pre-sales engineer. Terry was the creator of Crystal Reports, and later, the head of Seagate Software. I always have a soft spot in my heart for a fellow pilot .... even if his plane can go faster and higher than mine!

We validated that Coral8 was putting out the same output as our custom app, and I was enlightened on some of Coral8's capabilities that were not so easy to find in their wads of documentation. Although there is much good in Coral8, there were also some gotchas.

- Documentation needs to be consolidated a bit. There are a lot of separate manuals plus technical articles. There needs to be a "cookbook" on their CCL language.

- You cannot test a simple user-defined function without writing an intermediate stream. There is no simple way to dump a variable to the console. In other words, I would like to do this simple thing:

CREATE VARIABLE commission;
SET commission = CalculateCommission(); -- this is my user-defined function
CONSOLE.WRITE(commission);

- We managed to get the Coral8 Studio to freeze consistently. Luckily, no work was lost. The Coral8 Studio is written using wxWidgets, so I wonder how they do unit-testing on the studio.

My opinion is that, although it is great to have the advanced features, you still need to pay attention to the everyday, little tasks that developers need to do. Henry tells me that, in the future, Coral8 will move to a more Visual Studio, file-based way of developing. I certainly welcome this. Henry spent 8 hours watching me drive. When I was having problems, I verbalized the issues so that Henry could see what I was going through and he could bring the issues back to his management.

On the plus side :

- I have been reading about Coral8's pattern matching capabilities. We will definitely be exploring this.

- Coral8 has a relatively inexpensive barrier to entry. If we have production, development, and COB servers (all dual or quad-core machines), then it won't break our budget.

- Their software does not have any time limits on the evaluation versions. One thing that I do not like is a license key that is only good for 30 days. Given the nature of financial companies, we often get pulled into a lot of side projects. I don't want to be in the "thick of things", only to find out that the license key has expired. Coral8 is very friendly to the evaluator.


Now, on to Aleri. I will be using their new 2.4 release.



Financial Due Diligence

In my super-mega Investment Bank, we work with all kinds of vendors. In fact, in our old group, one of the things that we were charged with was investigating all kinds of esoteric technologies that could give us an edge in the trading world.

If you use a vendor for a "bet the farm"-type application, then you want to make sure that the vendor behind the product is rock solid. My boss, who is the Global Head of Equities Technology for the firm, asked me who the "800 lbs gorilla" is in the CEP space. He wanted to make sure that we were safe in our choice, and that no matter how good the technology is, we did not put all of our eggs into a guy working nights in his basement (although, those kinds of companies usually make the best software!).

There is really no 800-lbs gorilla in the "pure" CEP space. By "pure" CEP players, I am talking about guys like Aleri, Coral8, Streambase, Esper, Apama, Truviso, Kaskad, etc. In this space, most of these companies number their customers in the dozens rather than the thousands. These companies are all competing for that big reference customer, the one that companies can stand back and say "This investment bank doubled their trading revenues because of our product."

Compounding this fact is the whole credit crunch and subprime mess. The New York Times had an article yesterday that described the effects that the credit crunch is starting to have on all sorts of companies ... even software companies and web-design shops were mentioned. So, we need to make sure that the CEP vendor that we choose is not affected, nor will be affected, by the credit crunch. Coincidentally, one ex-employee of a CEP vendor sent me a private email that mentioned that his company was undergoing a round of layoffs.

No matter which CEP vendor you choose, you should always have one backup. This is standard practice when dealing with small vendors. You should have the vendor's source code in escrow. You should have your financial people do a deep dive on a vendor's financials. Is the vendor self-funded or are they VC funded? Does the VC have the appetite to wait 5 to 7 years for a good return on their investment? What is the past behavior of the VC firms with regards to startups? What happens if the chief architect of the product leaves the company? Is the product written in a mainstream language, in case you need to take possession of the source code that's in escrow?

These are all questions that you should ask before making the final selection of your vendor.




Saturday, November 24, 2007

First Use Case Done with Coral8

I now have Coral8 detecting when a sector has abnormal activity, and I have a Coral8 Output Stream publishing into a .NET application for visualization. If I want to, I can take the data from the Coral8 alert, transform it into a JMS message, and publish it out on the Tibco EMS bus for other applications in the organization to consume. Or, I can publish it out to another Coral8 stream.

Well done, Coral8 team!

Now, it's on to the Aleri evaluation. The good people at Aleri's sales engineering team have done most of this first use case, but now that I am armed with more Coral8 knowledge, I need to try to rebuild the Aleri use case from scratch by myself.


Thursday, November 22, 2007

More on the first CEP Use Case

Yesterday, I had a great two-hour session with Henry and Bob from Coral8 in which most of the use case was done.

Henry is our designated pre-sales engineer. His job is to do what it takes to make sure that the prospective customer is happy with the product before making a decision to purchase the product. Bob is the head architect of Coral8, and his job (as he described it) is to make sure that the product is as easy to use as possible.

Between Henry and Bob, two solutions were offered. I will go into the first solution in this blog entry. The second solution revolves around custom timestamping of messages by the input adapter, and this topic deserves a blog entry of its own.

The main problem was to analyze the order flow for each sector over a one minute timeslice, and determine if any sectors showed abnormal activity. The problem that I was faced with was that the concept of “time” was determined by the TransactTime field in the FIX message, and not by the “clock on the wall”. So, if for some reason, I received two FIX messages in a row, one whose TransactTime field was 14:24:57 and one whose TransactTime field was 14:25:01, then the receipt of the second FIX message should cause a new timeslice, regardless of what the wall clock said.
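To make the "message time, not wall-clock time" idea concrete, here is a minimal Python sketch of a bucketing function. The name mirrors the TimeToTimeBucket user-defined function that appears in the CCL, but the body is only an assumption: it presumes a 9:30 market open and a TransactTime that has already been reduced to an HH:MM:SS string.

```python
from datetime import datetime, time

MARKET_OPEN = time(9, 30)   # assumed open; a 6.5-hour day gives 390 one-minute buckets
BUCKETS_PER_DAY = 390

def time_to_time_bucket(transact_time: str) -> int:
    """Map an HH:MM:SS TransactTime to a 0-based one-minute bucket since the open."""
    t = datetime.strptime(transact_time, "%H:%M:%S").time()
    return (t.hour * 60 + t.minute) - (MARKET_OPEN.hour * 60 + MARKET_OPEN.minute)

def in_trading_day(bucket: int) -> bool:
    """True when the bucket falls inside the regular session (0 to 389)."""
    return 0 <= bucket < BUCKETS_PER_DAY

# The two messages from the example land in adjacent buckets, so the second
# one starts a new timeslice regardless of what the wall clock says.
first = time_to_time_bucket("14:24:57")
second = time_to_time_bucket("14:25:01")
```

Because the bucket is derived only from the message, replaying yesterday's order file would produce exactly the same timeslices as the live feed did.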

The solution that Henry came up with was to use a pulse in a stream. Although the concept of raising an event is very common in programming, it is not really something that you tend to do in a SQL stored procedure. The thing is that programming in Coral8’s CCL (as well as the SQL-like dialects that many of the CEP vendors have) is a combination of procedural and SQL programming, and the trick is to find the correct “pattern” to solve your problem. This is where many of the CEP vendors can improve; they can publish a listing of patterns, they can come up with FAQs, etc. I mentioned this to Bob of Coral8, so expect to see some movement on this front from the Coral8 folks.

Here is what the pulse stream looks like in Coral8’s CCL:

----------------------------------------------------------------------------------
-- LastTimeSlice holds the maximum timeslice (0 to 389) of the order stream.
-- When we see an order with a TransactTime greater than the current max timeslice,
-- then we set the new max timeslice. We also use this as a signal (pulse)
-- to one of the streams below.
----------------------------------------------------------------------------------
CREATE VARIABLE INTEGER LastTimeSlice = -1;
CREATE LOCAL STREAM stream_Pulse;

INSERT INTO stream_Pulse
SELECT
TimeToTimeBucket(FlattenNewOrder.TransactTime) AS epoch
FROM
FlattenNewOrder
WHERE
TimeToTimeBucket(FlattenNewOrder.TransactTime) > LastTimeSlice;

-- When we insert a new timeslice into the stream_Pulse stream, we also
-- set the new maximum timeslice.
ON stream_Pulse
SET LastTimeSlice = stream_Pulse.epoch;

We have a global variable that keeps the maximum timeslice that is flowing through our system. Since there are 6.5 hours in the trading day, there are 390 minute-sized timeslices that we want to consider.

In the INSERT statement, if the timeslice from the incoming FIX message is greater than the current maximum timeslice, then we insert a new record into the pulse stream.

The ON statement functions like a trigger. When a new record is inserted into a stream, you can have one or more ON statements that react to the event of inserting a record into the stream. Here, we set the new maximum timeslice.
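For readers who think procedurally, the pulse logic can be approximated in a few lines of Python. This is a loose analogue under stated assumptions, not a description of how Coral8 evaluates the statements: the generator name `pulses` is hypothetical, and the input is assumed to be a sequence of already-computed timeslice numbers.

```python
from typing import Iterable, Iterator

def pulses(buckets: Iterable[int]) -> Iterator[int]:
    """Yield a bucket only when it exceeds every bucket seen so far."""
    last_time_slice = -1                 # plays the role of the LastTimeSlice variable
    for bucket in buckets:
        if bucket > last_time_slice:     # the WHERE clause of the INSERT
            last_time_slice = bucket     # the ON ... SET trigger
            yield bucket                 # the row inserted into stream_Pulse

# Same-minute and late orders (the repeated 294, and 294 after 295) produce no pulse.
emitted = list(pulses([294, 294, 295, 294, 297]))
```

Note that a gap in the data (295 straight to 297) yields a single pulse, so any downstream logic should not assume that consecutive pulses are exactly one minute apart.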

We need to maintain a Window that contains all of the orders for the current timeslice. The order information includes the stock ticker, the sector that the stock belongs to, the number of shares in the order, and the current timeslice. In Coral8, a Window provides retention of records. You can specify a retention policy on a Window, whether it be a time-based retention policy (keep records in the window for 5 minutes) or a row-based retention policy (keep only the last 100 rows). What is missing here is a retention policy based on a boolean expression or on a certain column value changing. Streambase has this, and Coral8 knows that this feature should be implemented down the road.
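The three kinds of retention policy can be sketched in Python as follows. All class and function names here are hypothetical, timestamps are assumed to arrive in non-decreasing order, and the third class is only a guess at what a value-based policy might look like; it does not describe how Streambase or Coral8 implement it.

```python
from collections import deque

def keep_last_rows(n):
    """Row-based retention: a deque with maxlen drops the oldest row by itself."""
    return deque(maxlen=n)

class KeepSeconds:
    """Time-based retention: rows aged max_age seconds or more are expired."""
    def __init__(self, max_age):
        self.max_age = max_age
        self.rows = deque()              # (timestamp, row) pairs, oldest first

    def insert(self, timestamp, row):
        self.rows.append((timestamp, row))
        while self.rows and timestamp - self.rows[0][0] >= self.max_age:
            self.rows.popleft()

class KeepWhileSameKey:
    """Hypothetical value-based retention: clear when a watched column changes."""
    def __init__(self, key):
        self.key = key                   # function that extracts the watched column
        self.rows = []

    def insert(self, row):
        if self.rows and self.key(row) != self.key(self.rows[0]):
            self.rows.clear()            # the watched value changed, so start over
        self.rows.append(row)
```

With something like `KeepWhileSameKey(lambda r: r["bucket"])`, the window would empty itself whenever the timeslice column changed, which is the behavior the pulse stream and DELETE statement have to emulate by hand.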


----------------------------------------------------------------------------------
-- The TickerAndSector window holds all FIX orders for the current timeslice.
-- Each row of the window contains the FIX order and the sector information.
-- When we see a new timeslice, the TickerAndSector window is cleared
-- using a DELETE statement.
----------------------------------------------------------------------------------
CREATE WINDOW TickerAndSector
SCHEMA (Ticker STRING, SectorName STRING, SectorId INTEGER, Shares INTEGER, TransactTimeBucket INTEGER)
KEEP ALL;

INSERT INTO TickerAndSector
SELECT
FlattenNewOrder.Ticker,
TickerToSectorMap.SectorName,
TickerToSectorMap.SectorId,
TO_INTEGER(FlattenNewOrder.Qty),
TimeToTimeBucket(FlattenNewOrder.TransactTime)
FROM
FlattenNewOrder,
TickerToSectorMap
WHERE
TickerToSectorMap.Ticker = FlattenNewOrder.Ticker
AND TimeToTimeBucket(FlattenNewOrder.TransactTime) >= LastTimeSlice;


Now that we have a list of orders that occur for the current timeslice, we need to know when a new timeslice occurs. At this point, we need to analyze the orders for the current timeslice, find out which sectors are showing abnormal activity, and clear out the TickerAndSector window so that new orders can be accumulated for the new timeslice.

----------------------------------------------------------------------------------
-- The OrdersPerSectorPerMinute window contains the aggregated totals
-- for each sector for the previous timeslice. The aggregated totals include
-- the number of orders for each sector and the total number of shares for each sector.
--
-- The interesting part of this is the join between the TickerAndSector window
-- and the stream_Pulse. The stream_Pulse will be triggered when we see a new
-- timeslice.
--
-- When we insert rows into the OrdersPerSectorPerMinute window, we will trigger
-- a deletion of the old info in the TickerAndSector window.
----------------------------------------------------------------------------------
CREATE WINDOW OrdersPerSectorPerMinute
SCHEMA (SectorName STRING, SectorId INTEGER, OrderCount INTEGER, TotalShares INTEGER, Timeslice INTEGER)
KEEP 2 MINUTES;

INSERT INTO OrdersPerSectorPerMinute
SELECT
tas.SectorName, tas.SectorId, COUNT(*), SUM(tas.Shares), stream_Pulse.epoch
FROM
TickerAndSector tas, stream_Pulse
GROUP BY
tas.SectorId;

ON OrdersPerSectorPerMinute
DELETE FROM TickerAndSector
WHERE TransactTimeBucket < LastTimeSlice;

As you can see from the above code, when a new timeslice appears, we aggregate the number of orders and the total number of shares that are in the TickerAndSector window. The interesting thing here, and the thing that I might not have figured out on my own, was that we need to join with the pulse stream that we talked about before. The pulse stream here is being used to “kick start” the calculating and dumping of the records in the current timeslice.
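The "aggregate on pulse, then purge" step can be mimicked in plain Python. This is a sketch under assumptions rather than a translation of the CCL: the function name `on_pulse` and the dictionary keys are hypothetical, and unlike the CCL join it filters explicitly on the bucket so that rows belonging to the new timeslice are never counted.

```python
from collections import defaultdict

def on_pulse(window, epoch):
    """Aggregate the rows accumulated before `epoch`, then purge them.

    `window` is a list of dicts with keys sector_id, sector_name, shares, bucket.
    Returns (totals, remaining_window).
    """
    totals = defaultdict(lambda: {"orders": 0, "shares": 0})
    for row in window:
        if row["bucket"] < epoch:
            key = (row["sector_id"], row["sector_name"])
            totals[key]["orders"] += 1               # COUNT(*)
            totals[key]["shares"] += row["shares"]   # SUM(tas.Shares)
    remaining = [row for row in window if row["bucket"] >= epoch]   # the DELETE
    return dict(totals), remaining
```

Calling `on_pulse(window, 295)` on a window that holds two Energy orders from bucket 294 and one Tech order from bucket 295 would return the Energy totals and leave only the Tech row behind.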

Finally, since we have aggregated the information for each sector for the current timeslice, we want to see if any sector exceeded the maximum “normal” number of orders.

----------------------------------------------------------------------------------
-- This output stream will alert the user when a sector exceeds the
-- max orders for that timeslice.
----------------------------------------------------------------------------------
INSERT INTO AlertStream
SELECT
R.SectorId, R.SectorName, R.OrderCount, R.TotalShares
FROM
OrdersPerSectorPerMinute AS R, NormalOrdersPerSectorPerTimeslice AS H
WHERE
R.SectorId = H.SectorId AND R.Timeslice = H.Timeslice AND R.OrderCount > H.MaxOrders;

And, that’s it! If we attach a JMS output adapter to the AlertStream, we can generate a new, derived event, put that event back on the EMS bus (or we can send it into another Coral8 stream), and alert some kind of monitoring application.

Thanks to the Coral8 guys for helping me slog my way through the learning process.


Tuesday, November 20, 2007

Our First CEP Use Case (and thoughts on Coral8 and Aleri)

For the Complex Event Processing (CEP) engine evaluation, we have chosen a very simple use case. This use case is:

Tell us when orders for a sector show a greater-than-normal level.

Even though this use case seems very simplistic, and would not tend to be an ideal use case to test a CEP engine, it is an ideal use case for our environment. Why? It forces us to get at various data streams that have previously been inaccessible to most people, and it forces the owners of these streams of data to make their data clean.

(Note: this use case is a very generic use case and test for CEP. I am not giving away any special use cases that would give my company a competitive edge, nor will I ever do so in this blog.)

At the Gartner CEP Summit last September, Mary Knox of Gartner mentioned that one of the obstacles for doing successful CEP projects at large organizations was the process of liberating all of the data sources that you need, and getting the various silos to talk to each other. We have found this to be the case at our organization too. We figure that if we can get this simple use case to work, then we have won 50% of the battle.

What kind of data do we need to implement this use case?




  • We need to tap into the real-time order flow. Order flow comes to us through FIX messages, and for older systems, through proprietary messages that will one day be deprecated. Luckily, we have found a system that provides us this information. Although this system is a monitoring GUI, we have identified its importance to our company, and we are working with the product owner to split his app into a subscribable order service and a thinner GUI.
  • We need historical order data in order to determine what “normal activity” is for a sector. Luckily, we have this data, and we are in the process of getting access to it. We also need to understand what we mean by “abnormal activity”? Does this mean “2 standard deviations above the 30-day moving average for a sector”?
  • We need to be able to get a list of sectors, and for each order, we need to map each ticker symbol to its sector. Sectors are signified by something called GIC codes, and there are 4 levels of GIC’s. The important thing that we need is to ensure that all corporate actions get percolated down to these mapping tables. So, if a company changes its ticker symbol (like SUNW to JAVA), then the new ticker symbol needs to be automatically added to these mapping tables.
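One possible reading of “abnormal activity” from the list above, namely “2 standard deviations above the 30-day moving average”, can be written down in a few lines of Python. This is a sketch under assumptions: the function names are hypothetical, the input is assumed to be one sector's order counts for the same timeslice on previous days, and the sample standard deviation is an arbitrary choice.

```python
from statistics import mean, stdev

def abnormal_threshold(daily_counts, window=30, num_sigmas=2.0):
    """Return mean + num_sigmas * sample std dev over the last `window` values."""
    recent = daily_counts[-window:]
    if len(recent) < 2:
        raise ValueError("need at least two observations")
    return mean(recent) + num_sigmas * stdev(recent)

def is_abnormal(todays_count, daily_counts, **kwargs):
    """True when today's count is above the threshold derived from history."""
    return todays_count > abnormal_threshold(daily_counts, **kwargs)
```

The thresholds would be precomputed from the historical order data and loaded as reference data, so the CEP engine itself only has to do a comparison.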


    Let’s say that we are able to get all of the data that we need, and that the stream of data is pristine. We have to get it into the CEP engine for analysis.

    If you think of writing a normal, procedural program (i.e.: a C# app) to do this analysis, the steps are pretty easy.

    1) Read in all of the reference data. This includes the ticker-to-sector mappings and the list of normal activity per sector per time-slice. We will consider a timeslice to be a one-minute interval. In a 6.5 hour trading day, there are 390 minutes. There are also 11 “GIC0” sectors. So, a timeslice will be an integer from 0 to 389.

    2) Subscribe to a stream of FIX orders.

    3) As each order comes in, extract the ticker and map it to a sector. We are also interested in the number of shares in the order and the time that the order was placed. For each order, increment a running total for that sector and for that timeslice.

    4) Any orders that come in with a timeslice earlier than the current timeslice are ignored. Also, any orders that come outside of the normal trading day are ignored. This way, we don’t consider any orders that may have been delayed through our systems.

    5) If we detect a new and later timeslice, then examine all of the sectors for the previous timeslice. If any of the sectors show heightened activity, then alert the user. Then, clear the totals for all of the sectors, and start accumulating new totals for all of the sectors.
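The five steps above can be condensed into a small procedural sketch. It is written in Python rather than C# purely for brevity, the class and attribute names are hypothetical, and it tracks only order counts (not shares) to keep the example short.

```python
from collections import defaultdict

class SectorMonitor:
    def __init__(self, ticker_to_sector, max_orders):
        self.ticker_to_sector = ticker_to_sector   # step 1: reference data
        self.max_orders = max_orders               # {(sector, timeslice): limit}
        self.current = -1
        self.counts = defaultdict(int)
        self.alerts = []

    def on_order(self, ticker, timeslice):         # step 2 feeds orders in here
        if not 0 <= timeslice <= 389 or timeslice < self.current:
            return                                 # step 4: outside the day, or late
        if timeslice > self.current:               # step 5: a new timeslice begins
            self._close_timeslice()
            self.current = timeslice
        sector = self.ticker_to_sector.get(ticker)
        if sector is not None:
            self.counts[sector] += 1               # step 3: running total

    def _close_timeslice(self):
        """Alert on any sector over its limit, then reset all totals."""
        for sector, count in self.counts.items():
            limit = self.max_orders.get((sector, self.current))
            if limit is not None and count > limit:
                self.alerts.append((self.current, sector, count))
        self.counts.clear()
```

One consequence of driving everything from order timestamps is visible here: the final timeslice of the day is never closed unless something (an end-of-day message, for instance) pushes one more event through.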

    This looks pretty easy. I would assign this to a good C# developer, and hope to get a finished program in one or two days.

    Now, the task is to map this into a CEP engine.

    Most of the CEP engines have a language that is based on SQL. So, you can imagine all of the processing steps above passing through multiple streams in the CEP engine. For step 1) above, we would have two input streams, one for the ticker-to-sector mapping data and the other for the “normal sector activity” data. You can imagine two simple SELECT statements in SQL that read this data from some external database, and construct two in-memory tables in the CEP engine.

    For step 2, you need to write a specialized input adapter that subscribes to a communications channel (sockets or JMS) and reads and decodes the FIX orders. Most orders come through as NewOrderSingle messages (FIX message type = ‘D’). There are various versions of FIX, but let’s say that everything comes in as FIX 4.2 messages.
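The decoding half of such an adapter might look like the Python sketch below. The tag numbers are standard FIX (35 = MsgType, 55 = Symbol, 38 = OrderQty, 60 = TransactTime), but the function names are hypothetical, and the parser makes simplifying assumptions: it does not validate BodyLength or CheckSum, it assumes whole-share quantities, and it does not handle repeating groups.

```python
SOH = "\x01"   # standard FIX field delimiter

def parse_fix(raw: str) -> dict:
    """Split a tag=value FIX message into a {tag: value} dict."""
    fields = {}
    for pair in raw.strip(SOH).split(SOH):
        tag, _, value = pair.partition("=")
        fields[int(tag)] = value
    return fields

def flatten_new_order(raw: str):
    """Return (ticker, qty, transact_time) for a NewOrderSingle, else None."""
    f = parse_fix(raw)
    if f.get(35) != "D":           # 35 = MsgType, 'D' = NewOrderSingle
        return None
    # 55 = Symbol, 38 = OrderQty, 60 = TransactTime
    return f[55], int(f[38]), f[60]
```

The tuple returned by `flatten_new_order` corresponds roughly to one row of the FlattenNewOrder stream that the Coral8 code reads from.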

    Most of the CEP vendors support in-process and out-of-process adapters. In-process adapters are faster than out-of-process adapters, but out-of-process adapters are usually easier to write. An out-of-process adapter will read data from some kind of communications bus (or even from a database table or a flat file), and will write a data stream to the CEP engine. It would be ideal to have the CEP vendors support FIX in in-process input and output adapters.

    Step 4) is easy. We calculate the 0-based timeslice for an order, and if it is below 0 or above 389, then we ignore this order in the stream. This can be done with a simple WHERE clause in the SQL statement.

    We also need to record the “current timeslice” and ignore any orders that come before the current timeslice. So, we need the concept of a “global variable” and when we see an order with a later timeslice, we need to update this variable. This is something which is easy to do with a procedural language, but what is the best way to do this in SQL?

    Steps 3) and 5) are interesting. We need to keep a one minute window per sector. This window should only keep running totals for the current timeslice. When a new timeslice comes in, we need to analyze the sector activity in the current timeslice, do any alerts, and then clear out the totals in all sectors. Again, this is something that is extremely easy to do in a C# application, but translating it into SQL is a bit of a challenge.

    In step 3), the mapping of ticker to sector is very easy. It’s just a join of the ticker in the order with the ticker in the mapping table. The interesting thing is the choice of window type for the stream. Do we accumulate all orders for all sectors for the one-minute timeslice, and then, when we see a new timeslice, do we just take a COUNT() of the number of orders for each sector? Or, do we simply have a window with one row per sector, and keep running totals for each sector as an order comes in?
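The two window designs give the same answer and differ only in what they retain, which a short Python sketch makes visible. The sample orders are made up for illustration.

```python
from collections import Counter

orders = [("IBM", "Tech"), ("MSFT", "Tech"), ("XOM", "Energy")]   # (ticker, sector)

# Design A: retain every order row, then count per sector when the slice ends.
retained = list(orders)
counts_a = Counter(sector for _, sector in retained)

# Design B: one row per sector, updated in place as each order arrives.
counts_b = Counter()
for _, sector in orders:
    counts_b[sector] += 1
```

Design A holds one row per order, so memory grows with order volume, but the detail is still there if someone wants to drill into an alert. Design B holds one row per sector and nothing else.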

    Coral8 supports the concepts of sliding and jumping windows. Aleri supports only sliding windows right now. With Coral8, we can set a window that will hold one minute’s worth of data, and we can also tell a stream that it should dump its output after one minute. However, we don’t want to tie the TransactTime in a FIX order message to the actual clock on the computer. We need a stream that will produce output on a certain value in a column, and neither Coral8 nor Aleri seem to have this yet.
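The difference between the two window types can be illustrated with a generic Python sketch. These are the textbook notions of sliding and jumping windows, not Coral8's or Aleri's exact semantics, and aligning the jumping window to multiples of the width is an assumption.

```python
def sliding_window(events, now, width):
    """Rows from the last `width` seconds; old rows leave one at a time."""
    return [(ts, v) for ts, v in events if now - width < ts <= now]

def jumping_window(events, now, width):
    """Rows since the last multiple of `width`; the window empties all at once."""
    start = (now // width) * width
    return [(ts, v) for ts, v in events if start <= ts <= now]

events = [(50, "a"), (70, "b"), (110, "c"), (125, "d")]   # (seconds, payload)
sliding = sliding_window(events, now=125, width=60)
jumping = jumping_window(events, now=125, width=60)
```

At second 125 the sliding window still holds b, c and d, while the jumping window reset at second 120 and holds only d. In both cases `now` comes from a clock, which is exactly the coupling to wall-clock time that this use case needs to avoid.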

    Here is some Coral8 code that shows windows and streams:

    CREATE WINDOW TickerAndSector
    SCHEMA (Ticker STRING, Sector STRING, SectorId INTEGER, Shares INTEGER,
    TransactTimeBucket INTEGER)
    KEEP EVERY 60 SECONDS;

    INSERT INTO TickerAndSector
    SELECT
    FlattenNewOrder.Ticker,
    TickerToSectorMap.SectorName,
    TickerToSectorMap.SectorId,
    TO_INTEGER(FlattenNewOrder.Qty),
    TimeToTimeBucket(FlattenNewOrder.TransactTime, 'HH:MI:SS AM')
    FROM
    FlattenNewOrder,
    TickerToSectorMap
    WHERE
    TickerToSectorMap.Ticker = FlattenNewOrder.Ticker
    OUTPUT EVERY 60 SECONDS;

    The first statement defines a window that keeps one minute’s worth of order data. After one minute, the window will empty its contents.

    The second statement will insert a new row into the window whenever we get a new order. After one minute, the window will send its output to another stream further down the pipeline. (We hope that the data will be sent to the next stream before the window clears itself. Otherwise, we will lose all of the data.)

    So far, in my brief evaluation, I have found step 5) difficult to implement in Coral8. Aleri has implemented this by using a FlexStream. A FlexStream is a stream that has procedural logic attached to it. Aleri has a custom C-like programming language that you can use to implement procedural logic in a FlexStream. But, if you write too much logic using FlexStreams, then wouldn’t you be better off to just write a nice C# application?

    To validate some of the CEP engines, I ended up taking a day and writing a C# application that implements this use-case. For grins, I added a tab that showed some animated graphics using the very excellent ChartFX package. The head of the trading business was so excited by this eye candy that he started to bring over various traders for a look at my simple app. So, in addition to this little app giving the traders information that they did not have before, it provided them a flashy way to see real-time movement across sectors.

    In addition to having SQL skills, a good CEP developer needs to readjust their way of thinking in order to consider pipelined streams of SQL processing. There is a big debate going on in the Yahoo CEP forum as to whether SQL is a suitable language for CEP processing. So far, with this use case, I see the suitability of SQL, but I also need to step out of the SQL way of thinking and apply some procedural logic.

    One of the things that I still need to be convinced of is that CEP engines can do a better job than custom code. I am all ears. Any CEP vendor (even Streambase) is invited to submit public comments to this blog to tell me how this use case can be implemented with their system.



    Saturday, November 17, 2007

    CEP Vendor Thoughts

    Recently, I came across an article on Streambase in Windows in Financial Services magazine. One of the questions to the head of Streambase went like this:

    WFS: Does StreamBase have any competitors?

    BM: The major players have not yet delivered anything in this space. IBM, for example, does not have a project to build a technology like this. We are IBM’s solution in this space.

    In my opinion, this answer totally evades the question. What happened to companies like Aleri, Coral8, Esper, Apama, Skyler, Truviso, Kaskad, etc? How about the IBM offering that Opher is working on? All of these companies freely acknowledge Streambase as a worthy competitor, and rightly so. It would be nice to see Streambase acknowledge the same. Brown University certainly was not the only university doing CEP research and not the only one to commercialize their offerings.

    And shame on Microsoft and Windows in Financial Services magazine for letting this slip by. Are you a journalistic effort or a fluff rag?

    In our evaluation of CEP vendors, we chose not to evaluate Streambase for various reasons. Streambase might have the best technology of all of the CEP vendors (for example, look at Tibbets' comment from a few weeks ago on a question about cancelling events), but we will never get to find out. The people who I feel badly for at Streambase are the dedicated development and support staff who have probably come up with a really good product.

    (In the interest of fairness, Bill from Streambase told me recently that they had reduced the price of their offering, which was one of our concerns.)

    And, if anybody from Streambase reads this blog ---- doing an end-run around me and trying to market directly to the business will not earn you any points. The business people rely on me to make the right decision, and all of your email to the business side (as is any email from information technology vendors to the business side) gets forwarded directly to me. And, I guess that we will end up paying real dollars to your imaginary competitors.

    Meanwhile, let's take the attitudes of Coral8 and Aleri. One of these companies JUST hired its first salesperson. Their mantra was that the product should be the best that it can be before it was pushed by a salesforce. The other company has a low-key sales approach too. They have gone beyond the call of duty to incorporate our suggestions into their product and to come up with a POC that really impresses us.

    Both vendors have come up with FIX input adapters at our behest. Aleri has incorporated some of our suggestions into their FlexStreams, and has cleaned up some of their visual development studio. (With FlexStreams, you can use a procedural programming language to create custom processing for streams). I am impressed by what these companies have done to earn our business. I feel that, in exchange for these companies doing some of what we want, they get to expand their offerings for the capital markets communities, and bring themselves out of the narrow focus of algorithmic trading and pricing engines.

    Kudos to Mark, John, Henry and Gary of Coral8, and to Don, John, Jerry, Jon, David, etc of Aleri. All very nice people, and all trying to compete honestly for a piece of the pie.

    In my opinion, the Coral8 and Aleri offerings are so close that we will eventually be choosing one vendor as primary and the other as hot backup. What needs to be done is performance evaluation: pushing multiple streams of fast-moving data into the CEP engines and seeing how they perform under heavy load. Let's see if they can handle the data rates that come at 2:15 PM on a Fed decision day.

    One message that we have been hearing from the CEP and messaging vendors is that they perform better under Linux than Windows Server 2003. This is probably not a surprise to most people on Wall Street. But, I wonder what Windows Server 2008 has to offer in comparison to Linux. The November 8, 2007 article at Enhyper has some interesting things to say about Microsoft's marketing of the London Stock Exchange deal. We will most likely be running our CEP engine on Linux unless Microsoft comes up with a real compelling reason to the contrary.



    Wednesday, November 07, 2007

    IBM's ManyEyes

    (Thanks to Jules)

    ManyEyes is a community-based visualization project from IBM, run by the guy who did the Stock Market heatmaps for Smart Money.

    You can upload your own datasets and apply some pre-made visualizations to them. People in the community have contributed other visualizations.

    Pretty cool.


    Algorithm for Implementing Treemaps

    http://www.win.tue.nl/~vanwijk/stm.pdf


    Sunday, November 04, 2007

    Help Wanted for the Complex Event Processing Project

    I have open headcount for about 4 or 5 people for 2008 for the Complex Event Processing project that I am running.

    I realize that it is foolhardy to advertise for people who have prior experience in CEP. What I am looking for are smart developers who have a great passion to learn a new, interesting technology. The team that I envision will consist of:

    1) Visualization developer - come up with new, interesting ways to visualize events and data. The work may entail working with the .NET framework that my team has built, integrating visualizations with existing Java/Swing-based trader GUIs, or even exploring WPF (as the company gradually embraces .NET 3.x). You could be investigating visualization tools like heatmaps and you will definitely be evaluating third-party tools (both commercial and open-source). You will be involved in OLAP to some extent. There will be involvement in the building out of a notification and alerting framework.

    2) CEP developer. You will be building out the analysis part inside the CEP engine. Most of the CEP engines use a variant of SQL, so you should be fairly comfortable with SQL concepts. It would be nice if you had previous experience with tools like Coral8, Aleri, Streambase, Esper, etc, but even if you haven't, you should be willing to learn these tools. You may also be interacting with consultants from these companies.

    3) Networking, messaging, and market data specialist. Help us decide if we should migrate to a new messaging infrastructure (like RTI or 29West). Experience with Tibco EMS is a big plus, as well as experience with working with high volumes of data and low latency. Interact with Reuters and Wombat infrastructures, as well as internally-built market data infrastructures.

    4) Data specialist. You will be the person who is responsible for breaking down silos and getting good data into the CEP engine. Experience with SQL Server 2005 and Sybase is important. Experience with tick databases like KDB+ and Vhayu would be nice to have.

    Everyone will be doing a bit of everything, so everyone on this team will be intimately aware of what everyone else is doing.

    This is a highly-visible position in an investment bank that has promised me that they will reward good talent who comes to us from the outside.

    In addition to the positions mentioned above, I have two or three open headcount for people who want to work on the Ventana team. Ventana is the client-side .NET framework that is being used by various groups in our Investment Bank.


    Saturday, October 06, 2007

    The Consulting Market and More Thoughts

    Comments from various contacts in the consulting industry lead me to believe that there is a slowdown in financial IT consulting, with benches growing at various consulting companies around the globe. At the same time, there is a hiring freeze going on at many Wall Street companies, with exceptions being made only for outstanding candidates.

    A lot probably has to do with the subprime mess and the incredible losses that many Wall Street companies have announced. Bonus season is going to be very "challenging" this year.

    My company will be making a concerted effort to recruit outstanding IT people from outside. We realize that new blood is the way to shake things up, and transform our shop into one of the major IT houses on Wall Street. The business side has been taken over by people who have CompSci degrees from top universities, and this new regime realizes that tech will drive future profitability.

    Despite this, I have open headcount for 2008 for the Complex Event Processing initiative. When I get back from my week in France, I will post a general call for help. If you have a driving interest in CEP, and you can code in C#, Java, and C++, and you know how to deal with Linux and Windows, then I want to speak to you. I think that CEP will provide a fundamental change in the way we trade and the way that apps are written and the way that data is visualized.



    Saturday, September 29, 2007

    Cancelling Events

    http://blogs.streamsql.org/streamsql/2006/08/handling_revisi.html

    Mitch Cherniak poses a question that I posed a few weeks ago about "compensating events".

    Mitch seems to be involved in that same project that produced Streambase. I am not sure if Cherniak is involved with Streambase in any way, but it would be interesting if the Streambase folks have thought about this same topic.


    Wednesday, September 26, 2007

    A New Quant Blog

    http://4quants.blogspot.com/


    Tuesday, September 25, 2007

    OneTick

    Yet another tick database, to compete with KDB+ and Vhayu. Plus, they just inked a deal with Wombat.

    Looks like more mad Russian scientists that escaped from Goldman.


    Francis and Karen

    My colleague Francis and his young wife are involved now in the battle of a lifetime. It makes the daily bullcrap that we put up with at work seem very trivial, and makes decisions like "should we use DDS or JMS" seem inconsequential.

    My prayers and thoughts are with them daily.

    Francis has written a WROX book that is available from Amazon.



    DDS vs JMS

    Data Distribution Service (DDS) is an alternative to JMS. This spec was designed and is supported by OMG (Object Management Group). The concepts are fairly familiar to anyone who has done JMS.

    RTI implements DDS in their message bus.

    My feeling is that getting a large enterprise to use DDS is akin to rewiring a battleship. However, it seems like DDS has some possible performance improvements over JMS.

    Anybody out there using DDS?



    Saturday, September 22, 2007

    A New Financial IT Blog

    Tom Steinthal's blog is here


    Nancy

    I will be away the week of October 7th. I will be travelling to Paris and then to Nancy (France) for the annual Nancy Jazz Pulsations festival. The purpose of my trip is to see a 2-night concert by my favorite band, Magma. The famous French bassist, Jannick Top, will be performing with Magma, and this recreation of their famous 1974 line-up is reason enough for me to dip into my AMEX points....


    Coral8 vs Streambase

    http://www.dbms2.com/2007/08/10/coral8-versus-streambase/

    Interesting reply by Bill of Streambase .....

    However, I think that the only way that this will be solved publicly is by STAC and their independent benchmarks.




    CEP Conference Report

    I thought that the CEP conference would be a sleepy little symposium, but I was wrong.

    Gartner ran their conference on Business Process Monitoring from Monday through Wednesday morning, and it dovetailed exactly into the CEP conference. There was also an academic conference on CEP that ran concurrently with the BPM conference, so you had a lot of academics and curiosity-seekers at the CEP conference.

    I would estimate that there were about 200 attendees at the CEP conference, but probably half were from the vendor community. I counted about 50-60 attendees in the financial services track at the conference, and again, about 40-50% were vendors. So, there were probably about 30 folks from the financial services industry who were really curious about CEP.

    I give these numbers to illustrate that, in financial services, CEP is still in the curiosity stage. I am happy to say that most of the people there were just as confused as we were (i.e., what is the difference between a CEP engine and a rules engine, what is the best messaging middleware to use, do we really need a visual designer, etc). Most of the people who were using CEP were using it in a simple stream case ... pricing, algos, fraud detection, etc. I did not see any use cases that approach the magnitude of what we have to do, but then again, I don't expect organizations like Goldman Sachs to advertise what they are doing.

    It was nice to see some of the vendors trotting out their successes in financial services (Aleri with HSBC, CommerzBank and Barcap, Streambase with BofA, Coral8 with Wombat), and start to establish some meaningful partnerships. Nice to see that Coral8 is working closely with RTI.

    Mary Knox of Gartner did a nice presentation on what it takes to get CEP adopted by a large financial organization, but she compensated for this nice presentation by monopolizing the Q&A session with Robert Almgren of BofA. I was a bit puzzled by IBM's message about what parts of their CEP framework were already available in Message Broker, who was supporting it in IBM, who we get services from, etc.

    I was impressed with Mark from Coral8, who seems to bring that Russian mad scientist mentality to his company. Also encouraging to see that Terry Cunningham is at the helm at Coral8, fresh from his successes at Crystal and other companies. Terry knows what it takes to make a successful software company. They also seem to be on the top of their game with financial services, without restricting themselves solely to algo trading.

    Aleri seems like they have some raw power in their platform, and over time, they will be polishing their message more. They win the award for the most genial vendors of the conference.

    The key takeaways were:

    1) The CEP industry is still in its infancy.
    2) It's a fairly hot buzzword right now.
    3) The complete stack is important if you do large deployments. We need to understand all of the pieces surrounding the CEP engine.
    4) Message volumes are increasing at an exponential rate in financial services.
    5) It will be a long battle to break all of the silos in our company.



    Tuesday, September 18, 2007

    CEP Vendors, DBMS's and Defense Industry

    I made a comment in my last posting about RTI, a small messaging and CEP vendor. It seems that RTI got its start in the defense industry. Coincidentally, IBM is allowed to mention the fact that its System S arose out of a 4-year defense contract.

    CEP vendors who grew out of the DBMS sector might have certain advantages, but ones who grew out of defense have other advantages. First, you would expect that anything that had to pass military testing and certification would be industrial strength. These systems would have to be able to handle a huge throughput of signals and would have to be able to react correctly to the input .. after all, you don't want our ships firing a missile at a seagull. Second, the analysis tools have to be top-rate ... detecting a signal and firing a missile is much like pulling the trigger on a trade. Both have enormous consequences, and once out there, cannot be retracted.

    I am not completely positive, but I can imagine that messaging systems in the defense industry might not have to worry about guaranteed delivery. It seems to be OK to drop a few signals (event messages) that the radars send out, and eventually (a few milliseconds later), the signals would resume.
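
    To make the "OK to drop a few signals" idea concrete, here is a minimal sketch of a best-effort receiver. It assumes each message carries a sequence number (my assumption, not something any defense or messaging vendor has described to me), and the class and method names are purely illustrative. The receiver never asks for a retransmit; it just counts the gaps and keeps the newest reading.

```python
class BestEffortReceiver:
    """Hypothetical receiver that tolerates dropped messages instead of recovering them."""

    def __init__(self):
        self.last_seq = None   # highest sequence number accepted so far
        self.dropped = 0       # messages skipped over and never recovered
        self.latest = None     # most recent reading accepted

    def on_message(self, seq, reading):
        # Stale or duplicate messages are ignored; a newer reading already replaced them.
        if self.last_seq is not None and seq <= self.last_seq:
            return False
        # A jump in sequence numbers means messages were lost; count them and move on.
        if self.last_seq is not None:
            self.dropped += seq - self.last_seq - 1
        self.last_seq = seq
        self.latest = reading
        return True


receiver = BestEffortReceiver()
for seq, reading in [(1, "a"), (2, "b"), (5, "c"), (4, "late"), (6, "d")]:
    receiver.on_message(seq, reading)
```

    A guaranteed-delivery bus would have to buffer and replay messages 3 and 4 here. That trade-off is probably fine for a radar sweep, but I would not want it for order flow.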



    Thoughts about CEP Patterns

    One of the things that I have been wondering about:

    How do you do transactions in the context of event processing? For example, if I get an event, we might trigger some units-of-work and generate some more derived events from this initial event. These derived events might be put into the event cloud.

    If one of the units-of-work fails, how do we roll back the "transaction"? Do we need to define compensating events? Do we need to suck the derived events out of the event cloud? Do we need to attach "state" to the derived events which signal whether they are part of a transaction?

    We run into a lot of the same issues here that DBMS vendors have run into. How can the transaction patterns in the DBMS world be applied to CEP?
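
    As a scratchpad illustration of the compensating-event option, here is a minimal sketch. Everything in it is hypothetical: the `Event` and `EventCloud` classes, the `txn_id` and `compensates` fields, and the "Cancel" naming are my own inventions, not the API of any CEP engine. Derived events are tagged with a transaction id, and if a unit-of-work fails, a cancelling event is published for each derived event already in the cloud, in reverse order.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Event:
    name: str
    payload: dict
    txn_id: Optional[int] = None       # "state" marking the event as part of a transaction
    compensates: Optional[str] = None  # name of the derived event this one cancels


class EventCloud:
    """Toy in-memory event cloud (illustrative only)."""

    def __init__(self) -> None:
        self.events: List[Event] = []

    def publish(self, event: Event) -> None:
        self.events.append(event)


def process(cloud: EventCloud, txn_id: int,
            units_of_work: List[Callable[[int], Event]]) -> bool:
    """Run each unit-of-work; if one fails, publish compensating events."""
    derived: List[Event] = []
    try:
        for work in units_of_work:
            event = work(txn_id)
            cloud.publish(event)
            derived.append(event)
        return True
    except Exception:
        # Nothing is removed from the cloud; each derived event is cancelled, newest first.
        for event in reversed(derived):
            cloud.publish(Event("Cancel" + event.name, event.payload,
                                txn_id, compensates=event.name))
        return False


def reserve_inventory(txn_id: int) -> Event:
    return Event("InventoryReserved", {"qty": 100}, txn_id)


def route_order(txn_id: int) -> Event:
    raise RuntimeError("venue unavailable")  # simulated failing unit-of-work
```

    Calling `process(EventCloud(), 1, [reserve_inventory, route_order])` leaves an `InventoryReserved` event followed by a `CancelInventoryReserved` event in the cloud. Note that this sketch appends rather than deletes, so anything that already consumed the original event must also understand the cancel.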

    Another thing I have been interested in is the generation of recursive events. If I get an event and generate a derived event, how do I know that this derived event will not result in recursion? Again, DBMS vendors have run into the same situation with database triggers, and some lessons can be learned.
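
    One possible guard against recursive events, sketched under my own assumptions: each derived event carries a "lineage" of the rules that produced it, and a rule refuses to fire on an event that it already helped derive. The rule table format and the names below are hypothetical and are not taken from any CEP product.

```python
from collections import deque
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Event:
    kind: str
    lineage: Tuple[str, ...] = ()  # names of the rules that fired to derive this event


def run(initial: Event, rules: Dict[str, Tuple[str, str]]) -> Tuple[List[str], List[str]]:
    """rules maps a rule name to (trigger kind, derived kind)."""
    processed: List[str] = []
    blocked: List[str] = []
    queue = deque([initial])
    while queue:
        event = queue.popleft()
        processed.append(event.kind)
        for name, (trigger, derived_kind) in rules.items():
            if trigger != event.kind:
                continue
            if name in event.lineage:
                # This rule already contributed to the event: firing again would loop.
                blocked.append(name)
                continue
            queue.append(Event(derived_kind, event.lineage + (name,)))
    return processed, blocked


# Two rules that feed each other: a fill triggers a hedge order, and a hedge order triggers a fill.
rules = {
    "hedge": ("Fill", "HedgeOrder"),
    "refill": ("HedgeOrder", "Fill"),
}
processed, blocked = run(Event("Fill"), rules)
```

    This is stricter than the nesting-depth cap that databases typically put on triggers: it stops the cycle the first time a rule would re-fire, at the cost of carrying the lineage on every derived event.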

    So, does this mean that we are better off considering a CEP vendor who has roots in the DBMS world?

    (As you can see, I am just using the blog as a scratchpad for some thoughts .... but it gives you an idea about the things that any of you who are implementing CEP might have to consider...)



    HPC Conference Thoughts

    Yesterday's HPC On Wall Street conference could be summed up in one word .... PACKED! I thought that this would be a sleepy little conference, with a few crazy architects from a few IBs and hedge funds walking around. When I got there, I could not even move through the (very narrow) aisles that the vendors were set up in. I would guess that there must have been 700-800 people at this conference, and it is at these times that I am thankful that I am not a lightweight, as I merrily tossed my compatriots all over the Roosevelt Hotel in my quest to get to the conference rooms.

    The two panels that I went to were, unfortunately, complete wastes of time. Even some of the panelists that I spoke to afterwards were half-embarrassed to be on the follically-challenged panels. (I am still wondering what Gideon Low from Gemstone was doing up there in the session on Multicores and Market Data .. although he DID manage to come up with better answers than the Chief Architect of Fidelity!). It only reinforces my opinion that if you are doing something unique and important, your company will never let you speak about it on a panel!

    And, thanks to the people from Bear Stearns, who during our lunch with a vendor, planted themselves at our lunch table and managed to kill our conversations.

    I was heartened to see a constant flow of traffic at the Digipede booth. In several square meters, you could talk to Microsoft (CCS), Digipede, and Platform (who were displaying at the IBM booth). I was disappointed to see only one CEP vendor there (Aleri).

    The most interesting new player was STAC, who purports to be an "independent" kind of lab, where both vendors and customers can go to in order to get evaluations and benchmarking done .... sort of like a Consumer Reports magazine. One interesting comment from them was that, in their early experience, some of the claims about feature sets that various vendors make do not actually hold up when it comes time to implement the vendors' products. An independent lab for the securities industry is long overdue, and it is heartening to see vendors contribute hardware to STAC in order to "bootstrap" their operations. The two main guys from STAC are ex-Reuters guys, and so you know that they have good experience in market data systems.

    There were some really small players (like RTI in the messaging space) who might be ripe for joint ventures or investment. Seeing these guys work the floor reminded me of my days as a software vendor exhibiting at shows ... an absolutely exhausting experience.

    My colleagues have told me that the HPC show has grown exponentially over the past four years .... if it grows any more, then the Roosevelt Hotel is going to have to abandon the cattle car approach, or else, I am going to have to put on a few more pounds!

