Mark Logic CEO Blog
Formerly The Mark Logic CEO Blog, this blog is written by Dave Kellogg, CEO of Mark Logic Corporation, covering next-generation database management, enterprise search, and content management technologies along with commentary on Silicon Valley, venture capital, and the business of software.

EMC Acquires Data Warehouse Vendor Greenplum; Creates New “Data Computing” Product Division

Wed, 07/07/2010 - 00:06

See EMC’s press release on the deal here.  First, some takeaways from the press release and related coverage:

  • All cash transaction, valuation undisclosed.  See below for some fun with math, trying to guesstimate it from standard ratios.
  • Greenplum had raised $61M in venture capital.
  • EMC intends to create a new “data computing” product division and to have Greenplum CEO Bill Cook run it, reporting to Pat Gelsinger.
  • This is the second of the specialty data warehouse DBMS vendors to get acquired.  Microsoft acquired Datallegro in 2008 at a rumored valuation of $250M.
  • Greenplum was ranked a visionary in Gartner’s Data Warehouse DBMS magic quadrant in January.  They were positioned about 70% on vision and about 49% on execution.   The leaders were Teradata, Oracle, IBM, Netezza, Microsoft, and Sybase.
  • Greenplum’s CEO and two co-founders have posted an open letter to customers and partners which argues that EMC is “uniquely positioned to dramatically accelerate the Greenplum vision of building the enterprise data system of the future.”
  • In addition to their DBMS, Greenplum offered an “enterprise data cloud platform” called Chorus, which includes something called the Greenplum Data Hypervisor.
  • This Wall Street Journal article quotes EMC talking about “great synergies” between Greenplum and VMware which to me are non-obvious.  Perhaps they’re related to the prior point.
  • EMC will continue to offer Greenplum’s full product portfolio to customers.
  • Note this, buried at the end of the press release:  EMC plans to deliver new EMC Proven reference architectures as well as an integrated hardware and software offering designed to improve performance and drive down implementation costs.  Pretty clearly, this says a data warehouse machine/appliance is coming.

So what does all this mean?

  • That storage vendors are going to continue to move up the food chain.  EMC has done a slew of acquisitions — Greenplum looks to be its 53rd — and I expect that to continue.  Storage itself is changing as it continues to include more networking and memory technology.  But storage vendors are changing too, not content to get stuck in the commoditization trap.
  • That yet another type of vendor is now attacking the database market.  In addition to (1) a slew of startups focused on specific niches, we now have (2) SAP via its acquisition of Sybase, and (3) now EMC via Greenplum attacking different segments of the ~$15B/year database market.  The big three oligopoly should not sleep too soundly at night.
  • With my Aster Data board hat on, I’d say that EMC is only getting part of the picture.  Basic data warehousing on big data is only part of the equation.  What customers ultimately want to do with big data is analyze it, and that requires the high-performance execution of complex analytics on big data — something that Aster Data does uniquely well.  Most of the data warehouse DBMS market is focused simply on reducing the price/performance for basic data warehousing.  To my knowledge, only Aster Data is focused on that plus enabling complex, high-end analytics.

Here’s my estimate on the valuation range.  This is based on math, guesswork, intuition, and standard ratios.

  • LinkedIn says Greenplum has about 130-140 people.
  • Enterprise software company revenue often runs about $250K to $350K/employee.
  • This implies revenues of $30M to $50M.
  • Software companies typically sell for 1-2x revenues when they’re in trouble, 2-3x revenues when they’re plodding along, and 8-10x revenues when things are hot.
  • Netezza, for example, currently trades at 4x revenues.  (But remember, that’s to buy one share.  If you want to buy them all, you’ll have to pay a premium.)
  • Greenplum, to my knowledge, was doing pretty well.  Let’s take 5-8x as my guess on the revenue multiple.
  • This implies a valuation range of $150M to $400M.
  • It’s hard to imagine that their last funding of $27M in January 2008 was done at anything less than $100M post-money, and possibly a fair bit more.
  • No VC would want a mere 1x return over 2+ years on a company that was doing well.  If true, this wipes out the low end of the valuation range.
  • This leaves me estimating the valuation at somewhere between $300M and $400M.
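For readers who want to poke at the numbers, here is a minimal sketch of the range arithmetic in the list above.  It uses only the figures quoted in this post; the function and variable names are my own, for illustration, and the rounding to $30M to $50M mirrors what I did by hand.

```python
# Back-of-envelope valuation range, using only the figures quoted above.
# All names here are illustrative; amounts are in $M.

def interval_product(a, b):
    """Multiply two (low, high) ranges of positive numbers."""
    return (a[0] * b[0], a[1] * b[1])

headcount = (130, 140)               # LinkedIn estimate
revenue_per_employee = (0.25, 0.35)  # $250K to $350K per employee, in $M
revenue_multiple = (5, 8)            # guessed multiple for a healthy company

# Raw revenue range implied by headcount and revenue per employee.
revenue = interval_product(headcount, revenue_per_employee)

# The post rounds that range outward to $30M to $50M before applying the multiple.
revenue_rounded = (30, 50)
valuation = interval_product(revenue_rounded, revenue_multiple)

print(f"Raw revenue estimate: ${revenue[0]:.1f}M to ${revenue[1]:.1f}M")
print(f"Valuation range: ${valuation[0]}M to ${valuation[1]}M")
```

Changing any one input range shows how quickly the ends of the valuation range move, since each step multiplies the width of the previous one.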

Bear in mind that it doesn’t take much to swing these numbers because they are built by multiplying estimates and ranges.  A few changes here or there and I can get $200M.  Or I can get $500M.  My real hope is that I have enough offsetting errors that I end up close to the right answer!  If I get new information that either changes my estimates or simply provides the facts, then I will try to update this post and share it.

Categories: Companies

MarkLogic: NoSQL Before NoSQL Was Cool

Mon, 06/28/2010 - 21:00

Long-time database guy and MarkLogic VP of Engineering Ron Avnur said at our last user conference that MarkLogic was “NoSQL before NoSQL was cool.”  He even made up about 500 t-shirts with that slogan on them and handed them out.  See Ron if you want a t-shirt.  See this post if you want my analysis of his statement.

Let’s look first at what MarkLogic is about:

  1. Unstructured data.  This means not only dealing with data in odd structures (e.g., sparse and/or semi-structured data), but also handling words and all the challenges that go with them.
  2. Scaling on cheap hardware.  In effect, scaling like Google, using racks of inexpensive pizza boxes instead of big, expensive computers with expensive SANs attached.  This is accomplished via shared-nothing clustering.
  3. A non-relational data model.  MarkLogic Server uses the XML data model.
  4. Document-orientation.  MarkLogic is a document-oriented system, meaning that the fundamental modeling unit is the (XML) document and that the system includes search functionality, in the same way that a smartphone includes a GPS.
  5. Ad hoc queries.  A reductionist mission statement for MarkLogic Server is “to perform database-style queries on unstructured information.”  (See diagram below.)
  6. Standard interfaces.  We believe in standard interfaces, in part because it’s in our self-interest to do so.  Standards help de-risk the purchase of new technologies from high-growth vendors.  We support a number of W3C standards: XQuery, XPath, XML, XHTML, XPointer, and, coming soon, XSLT.
  7. ACID transactions.  We’re database guys.  While we’ll let you turn off the transaction system and are in the midst of implementing replication with a consistency dial, by default we do ACID.

One interesting way to summarize these points is to say that we are trying to build a database for the Internet era.

If you look at what the systems I’ll call the “new NoSQL systems” are about, you’ll see:

  1. Unstructured data, dealing with tweets, updates, graphs, and profiles more than information that fits nicely into tables.  Interestingly, despite this, most of the new NoSQL systems do not integrally include search.
  2. Scaling on cheap hardware.  In this case, clusters ranging from on the order of 100 to on the order of 100,000 nodes.
  3. A non-relational data model.  Various new NoSQL systems are key/value oriented, document-oriented, or graph oriented.
  4. Document-orientation.  While not all new NoSQL systems are document-oriented, some NoSQL systems (e.g., MongoDB, CouchDB) are.
  5. Primary key look-ups.  The new NoSQL systems are not about ad hoc queries.  See this  (profanity warning) cartoon, which pretty  much says it all.  Some of the new NoSQL systems allow secondary indexes, but one way key/value stores go fast is specifically by disallowing additional indexes beyond the primary key.
  6. The new NoSQL systems are simply too new (most are version 0.XX.XX) to have evolved to standard interfaces.  This means that an application built for Hadoop is not easily moved to Cassandra.  Don’t confuse “open source” (i.e., you can see the source code) with “standard API.”
  7. BASE transactions.  The new NoSQL systems trade traditional database consistency for 1,000 to 100,000 node scaling and performance.  In the end, who cares if you lose my “Dave’s in aisle 7 at Safeway” tweet?

If you look at the overlap, you’ll see that MarkLogic and the new NoSQL systems overlap on points 1, 2, 3, and 4.  While we certainly can provide key/value storage (think of a two-element schema: <key>key-id</key> and <value>value-payload</value>), we provide considerably more index options and generally perform faster using our own search-engine-style indexing than the b-tree indexing used by both traditional RDBMSs and the new NoSQL systems.
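To make the two-element schema idea concrete, here is a rough sketch in Python using the standard library’s xml.etree.ElementTree.  It is not MarkLogic code and says nothing about how MarkLogic indexes documents; the <entry> wrapper element, the sample keys, and the helper names are all assumptions made up for illustration.  A plain dict stands in for the primary-key index.

```python
import xml.etree.ElementTree as ET

def make_kv_document(key, value):
    """Build <entry><key>..</key><value>..</value></entry>.

    The <entry> wrapper is a hypothetical root element, added only
    because an XML document needs a single root.
    """
    entry = ET.Element("entry")
    ET.SubElement(entry, "key").text = key
    ET.SubElement(entry, "value").text = value
    return entry

# A plain dict stands in for the primary-key index of a key/value store.
store = {}
for k, v in [("user:42", "Dave"), ("user:43", "Ron")]:
    store[k] = make_kv_document(k, v)

def get(key):
    """Primary-key lookup: the access path a pure key/value store offers."""
    return store[key].findtext("value")

def find_keys_by_value(value):
    """Ad hoc query on the payload; a linear scan here, since this toy has no secondary index."""
    return [k for k, doc in store.items() if doc.findtext("value") == value]

print(get("user:42"))
print(find_keys_by_value("Ron"))
print(ET.tostring(store["user:42"], encoding="unicode"))
```

The point of the toy is the contrast between the two functions: the first is all a key/value store needs, while the second is the kind of question that calls for additional indexes.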

On point 6, I think we have a huge advantage, not only because we are using standards, but because we are using Internet standards from the W3C.

On point 7, as mentioned, we are largely ACID people, but we are working to give users more flexibility to lower the ACID bar in order to trade for other variables.

All in all, I agree with Ron.  I think MarkLogic was NoSQL before NoSQL was cool.

Categories: Companies

Seeing Both Sides of an Issue

Mon, 06/28/2010 - 16:15

The ability to see both sides of an issue is a critical executive skill.  Yet, in typical corporate America culture, that skill is all too often lost.  Why?

  • Things get partisan:  sales wants X, marketing wants Y, finance wants Z.
  • Discussions turn blame-oriented.  Instead of working to solve problems, people work to avoid blame.
  • Managers lose interest in understanding the alternative positions.
  • People don’t listen to each other, often because they’re too busy thinking of what they’re going to say next.  (Resulting in what one friend calls “parallel independent conversations.”)

The solution is to force managers to articulate both sides of important issues.  If a person is advocating thing X instead of thing Y, I want them to be able to clearly and convincingly explain the advantages of both.  The best decisions come, in my opinion, when you hold two opposing ideas in your mind at once, and then choose.

When done correctly, you will see:

  • A focus on solutions, not blame.  Example:  “help me understand how you want to solve the problem.”
  • Managers looking forward, not back.  This flows naturally from the prior point.
  • Managers practicing active listening, a great technique for trying to understand the other person’s point of view.  Example:  “so, Ted, you’re telling me that you think we’re doing too many tradeshows that result in poor quality leads — is that correct?”

But seeing both sides of an issue only gets you halfway to your goal.   In many big companies, the unintended dysfunctional consequence of doing so is passivity and fence sitting:

  • Well, we could do A or we could do B.  Frankly, I’m open.
  • The consensus in the meeting was that both A and B were good options.  (This hits my “launch” button!)
  • Well, there are certainly advantages and disadvantages to both options.
  • We should pick the option that keeps the most other options open.  (Also known as The MBA Credo).

Somewhere along the way in corporate America, managers forgot that they are paid to make decisions.  The point of seeing both sides isn’t to avoid decision making.  The point is to make better decisions.

To ensure a focus on decisions, I usually run a line of questioning that starts with the decision and backs up from there.

  • What do you think we should do?  (And push for a single answer)
  • Why do you think we should do it?
  • Why should we do the alternative?

If you’re already performing these techniques, great.  If you’re not, give them a try and let me know how they work.

Categories: Companies

Beware the Spectacular B-Round Valuation

Sat, 06/26/2010 - 17:20

Visualization tools startup Palantir announced a follow-on financing round yesterday, raising $90M at a claimed $735M valuation.  Since most people aren’t familiar with either finance or VC math, this can generate confusion so I thought I’d do a post explaining a few things.

The first is simple:  do not confuse valuation with revenue.  Valuation (or for public companies, market capitalization) is an implied metric based on per-share price and number of shares outstanding.  For example, a public company with 50M shares and a $20 share price has a valuation of $1B.  That alone says nothing about its revenue.   TechCrunch makes this mistake three times in the story, calling Palantir “the next billion-dollar company” in the headline, saying they’re a “near-billion dollar company” in the middle,  and at the end, saying they are close:

It’s hard to imagine a billion-dollar company without a sales team, but then again Palantir is getting pretty darn close.

This is simply not true.  By my guess, Palantir is doing somewhere between $25M and $50M in GAAP revenues — nowhere near $1B.  Furthermore, while I hate to be technical, I could easily believe they are doing less:  as I understand their model, recognizing GAAP revenues should be a nightmare — e.g., calling all field staff engineers and claiming no services business implies field-based R&D implying the need to defer revenues until product completion for a given customer.

The second confusion is more subtle and relates to a quirk in VC math that makes an early round investor, who believes in the company and has cash to put to work, valuation neutral on subsequent financing rounds.  In fact, you could argue that they’re not valuation neutral, but positively biased because they mark their existing shares to the new valuation when reporting back to their limited partners.

Reminder:  I am no longer talking about Palantir specifically because their capital structure is both private and presumably more complicated than I describe here.  I am trying to show, in general terms, how some quirks result in early-round investors liking higher subsequent-round valuations — even when they’re buying shares at those higher prices.

For a quick primer on VC math and terminology, go here.  Now, let’s examine a spreadsheet I built to concretely demonstrate the mechanics of what I’m talking about.

In my example, a hot company manages to raise a $24M A-round at a pre-money valuation of $36M.  This is unattainable for most entrepreneurs, but let’s say you made a lot of money on your last gig and thus have some friends in the venture capital community who believe in you.  Note that as part of this round, VC1 has invariably negotiated himself the right to avoid dilution in subsequent rounds.  Since he owned 40% of the company after the A-round, he thus has the right to purchase 40% of any new shares sold by the company going forward. This is called exercising his pro rata.

Now it comes time to do the B-round.  Let’s say that things are going well and that the company somehow thinks it should be able to raise $30M at a $180M pre-money valuation. That’s scenario I in my spreadsheet.  Let’s see what happens.

  • In the B-round, the company sells 5M new shares at $6/share for $30M.
  • VC1 chooses to fully exercise his pro rata and thus buys 2M shares for $12M.
  • That leaves 3M shares for the new investor, VC2, who pays $18M.

Seems like a pretty good deal, but wait. If you’re executing the go-big-or-go-home strategy which both you and VC1 agree is appropriate, then $30M isn’t enough.  You decide you need $90M.  That’s scenario II in my sheet:

  • You issue 15M shares at $6/share to get $90M.
  • VC1 exercises his pro rata and buys 6M shares for $36M.
  • VC2 buys 9M shares for $54M.

Everybody’s happy, but then you look at founders and employees whose ownership has dropped from 60% before the round to only 40% after.  Most people would call this a 33% dilution (20 divided by 60), though some would call it a 20% dilution (60 minus 40).  Either way, while this scenario raises the money needed, the team loses a lot of ownership in the process and doesn’t like that one bit.

Then, the creative type on the team says: “I can solve the problem.”  See scenario III:

Why sell 15M shares at $6 when we can sell about 4.3M shares at $21 to get the same amount of money?  We’re better off, keeping 52% ownership for ourselves, and the great part is VC1 doesn’t care.  No matter the valuation, if we’re raising $90M and if VC1 is exercising his pro rata, then he’s in for $36M (see the boxed cells on the spreadsheet).  All we need to do is to get together with VC1 and find some dumb money willing to pay $54M for 8% of the company.  There’s plenty of dumb money out there these days and if we can’t get it in one investor, then maybe we can build a little consortium of a few.

And while we might view VC1 as valuation-neutral from one perspective, we shouldn’t forget that he has a boss, too.  He reports back a few times / year to his limited partners.  If we do the deal at $630M pre-money, then he can mark up his A-round shares from $24M to $252M in value, showing a 10x paper return to his investors.
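For those who can’t see the spreadsheet, the three scenarios can be reproduced with a few lines of Python.  This is a sketch under the assumptions of my example, not a model of any real cap table: 30M shares outstanding after the A-round (12M held by VC1, 18M by founders and employees), which is what $6/share at a $180M pre-money valuation implies.  The function and field names are made up for illustration.

```python
def b_round(raise_m, price, shares_before=30.0, vc1_before=12.0, pro_rata=0.40):
    """Model a B-round.  Money is in $M, share counts in millions.

    Defaults assume the A-round from the example: 30M shares outstanding,
    of which VC1 holds 12M (40%) and founders/employees hold 18M (60%).
    """
    new_shares = raise_m / price
    vc1_new = pro_rata * new_shares   # VC1 exercises his full pro rata
    vc2_new = new_shares - vc1_new    # the new investor takes the rest
    total_after = shares_before + new_shares
    founders = shares_before - vc1_before
    return {
        "pre_money": shares_before * price,
        "new_shares": new_shares,
        "vc1_check": vc1_new * price,
        "vc2_check": vc2_new * price,
        "vc2_pct": vc2_new / total_after,
        "founders_pct": founders / total_after,
        "vc1_mark": vc1_before * price,  # A-round shares marked at the B-round price
    }

# (amount raised in $M, price per share in $)
scenarios = {"I": (30, 6), "II": (90, 6), "III": (90, 21)}
results = {name: b_round(*args) for name, args in scenarios.items()}

for name, r in results.items():
    print(
        f"Scenario {name}: pre-money ${r['pre_money']:.0f}M, "
        f"new shares {r['new_shares']:.1f}M, "
        f"VC1 pays ${r['vc1_check']:.0f}M, VC2 pays ${r['vc2_check']:.0f}M "
        f"for {r['vc2_pct']:.1%}, founders keep {r['founders_pct']:.1%}, "
        f"VC1's A-round shares marked at ${r['vc1_mark']:.0f}M"
    )
```

Compare scenarios II and III: VC1 writes the same $36M check either way, while the founders’ stake and the mark on VC1’s A-round shares both rise with the price.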

I am not saying this has or has not happened with any given company.  I would like to make the important note that the whole notion of “dumb money” is at odds with free market theory.  I’ll also add that I know some quality VCs advise limited partners to ignore investment marks-to-market, but I doubt they all do.  Nevertheless, I hope this story shows that there’s potentially more than meets the eye in the world of venture financing, driven largely by the dual role (owner and seller) played by the existing VCs and founders/employees.

So what do I think it really means when a company announces a big round at a high valuation?  I think it means that:

  • The company is trying to build and/or sustain a hype bubble and wants to be seen as hot.  Most VC-backed companies do not disclose valuations.
  • The company is executing a “go big or go home” strategy that I’d argue increases the risk for its customers. Remember, Amazon went big.  Webvan went home. See the Fit or Fat Startup Debate launched by Ben Horowitz and countered by Fred Wilson for an examination of such strategies from the VC point of view.  In my estimation, sometimes they produce a great result, often a great crater, and rarely a great business.  Ironically, you can get nice exit valuations off such strategies but not great multiples.
  • The company has a supportive A-round investor willing to invest real money and who believes in the go-big strategy.
  • The company intends to spend the money, either because it must in order to sustain the current burn rate or because it wishes to expand into other areas.  The former signals an unsustainable situation; the latter signals a potential loss of focus.
  • If things don’t go as the company plans, the dumb-money will put constant pressure on management to be aggressive, reminding everyone of the expectations they bought into.  This can make it hard to back off and change direction in the event of bumps along the way.
  • The company could have trouble exiting at otherwise reasonable valuations, especially if the dumb-money controls a class of stock.  Think:  “I need at least a 2-3x on this investment.”
Categories: Companies

Doping is to Cycling as Poor Officiating is to Soccer

Sat, 06/19/2010 - 18:25

This is a post on marketing as much as sports.  Here’s my logic:

  • If you want to maximize the audience for your sport, and ergo maximize potential revenues, then outcomes need to be fair.  Professional wrestling excepted (which Wikipedia refers to as “a form of sporting theater”), who wants to watch a sport where the outcome is either random, predetermined, or meaningless?
  • Cycling has been ruined as a sport by doping.  Who wants to invest twenty-something days watching the Tour de France, see Floyd Landis win it, and then get stripped of his title a few days later for doping?  It ruins the fun when people are cheating, and as long as people are cheating the results are meaningless.  Who wants to watch sports where the outcomes are meaningless?  Some people, but not me — I haven’t really followed the Tour since 2006 — and not lots of others.  Ergo, the potential audience is not maximized.
  • Long before yesterday’s goal scandal, I had argued that soccer suffers from serious officiating problems which, among other things, limit its ability to succeed as a major sport in the USA.  Soccer is a low scoring sport so the impact of blown calls is much larger than in higher-scoring sports.  One blown foul call in a basketball game that ends 110-100 makes little impact.  One blown call in a soccer match that ends 2-2 makes the difference between the USA automatically qualifying for the next round and (basically) being in a win-or-go-home situation in its next match.

The real problem here is FIFA which stubbornly refuses to use technology to solve this problem.  Video replays and ball-sensors are obvious solutions to the problem.  (I’d also argue that soccer should add a fifth referee simply to manage the pushing and shoving in the box on set pieces, much as years ago hockey added a fourth one just to look after nastiness off the play.)  Yet FIFA somehow insists that such things are not in the culture of soccer, which is frankly an idiotic excuse for not fixing the problem.  As a friend once said about presentations — why is the presenter the only person in the room who can’t see the tweetstream? — why is the center referee the only person on the planet who can’t see the video replay?

Back to marketing, if FIFA won’t fix the problem, then over time I think Adam Smith will.  People will gradually lose interest in a sport that every day becomes more and more out of touch with both technology and consumer expectations.  Yes, soccer has a huge worldwide audience today, but if such injustices continue, worldwide interest will erode over time, and in America, soccer, from an audience size perspective, will continue to be a C-tier sport.

Categories: Companies

Six Thoughts on The NoSQL Movement

Sat, 06/19/2010 - 02:48

We are in the middle of one of our periodic analyst tours at MarkLogic, where we meet about 50 top software industry analysts focused in areas like enterprise search, enterprise content management, and database management systems.  The NoSQL movement is one of four key topics we are covering, and while I’d expected some lively discussions about it, most of the time we have found ourselves educating people about NoSQL.

In this post, I’ll share the six key points we’re making about NoSQL on the tour.

Our first point is that NoSQL systems come in many flavors and it’s not just about key/value stores.  These flavors include:

  • Key/value stores (e.g., Hadoop)
  • Document databases (e.g., MarkLogic, CouchDB)
  • Graph databases (e.g., AllegroGraph)
  • Distributed caching systems (e.g., Memcached)

Our second point is that NoSQL is part of a broader trend in database systems:  specialization.  The jack-of-all-trades relational database (e.g., Oracle, DB2) works reasonably well for a broad range of applications — but it is a master of none.  For any specific application, you can design a specialized DBMS that will outperform Oracle by 10 to 1000 times.  Specialization represents, in aggregate, the biggest threat to the big-three DBMS oligopolists.  Examples of specialized DBMSs include:

  • Streambase, Skyler:  real-time stream processing
  • MarkLogic:  semi-structured data
  • Vertica, Greenplum:  mid-range data warehousing
  • Aster:  large-scale (aka “big data”) analytic data warehousing
  • VoltDB:  high volume transaction processing
  • MATLAB:  scientific data management

Our third point is that NoSQL is largely orthogonal to specialization.  There are specialized NoSQL databases (e.g., MarkLogic) and there are specialized SQL databases (e.g., Aster, Volt).  The only case where I think there are zero examples is general-purpose NoSQL systems.  While I’m sure many of the NoSQL crowd would argue that their systems can do everything, is anyone *really* going to run general ledger or opportunity management on Hadoop?   I don’t think so.

Our fourth point is that NoSQL isn’t about open source.  The software-wants-to-be-free crowd wants to build open source into the definition of NoSQL and I believe that is both incorrect and a mistake.  It’s incorrect because systems like MarkLogic (which uses an XML data model and XQuery) are indisputably NoSQL.  And it’s a mistake because technology movements should be about technology, not business models.  (The open source NoSQL gang can solve its problem simply by affiliating with both the NoSQL technology movement and the open source business model movement.)

As CEO of a company that’s invested a lot of energy in supporting standards, our fifth point is that, rather ironically, most open source NoSQL systems have proprietary interfaces.  People shouldn’t confuse “can access the source code” with “can write applications that call standard interfaces” and ergo can swap components easily.  If you take offense at the word proprietary, that’s fine.  You can call them unique instead.  But the point is an application written on Cassandra is not practically moved to Couch, regardless of whether you can access the source code of both Couch and Cassandra.

Our sixth point is that we think MarkLogic provides a best-of-both-worlds option between open source NoSQL systems and traditional DBMSs.  Like open source NoSQL systems, MarkLogic provides shared-nothing clustering on inexpensive hardware, superior support for unstructured data, document-orientation, and high-performance.  But like traditional databases, MarkLogic speaks a high-level query language, implements industry standards, and is commercial-grade, supported software.  This means that customers can scale applications on inexpensive computers and storage, avoid the pains of normalization and joins, have systems that run fast, can be implemented by normal database programmers, and feel safe that their applications are built via a standard query language (XQuery) that is supported by scores of vendors.

Categories: Companies

Questioning the Tech Wunderkind Image

Sun, 06/13/2010 - 18:41

One of the things that irritates me about Silicon Valley culture is its blatant ageism.  I dislike it for several reasons:

  • Let’s start with the easy one:  it’s illegal.  As an employer you should be looking for someone qualified to do the job, not someone of a specific age.  While certain job requirements may end up setting a de facto lower bound on age (e.g., it’s hard to have a top MBA and 5 years of second-line management experience before you’re 30), age is not something you should talk about in the recruiting or management process.  People who would never say “let’s go find a Baptist to do this job” or “let’s go find a woman” will say things like “let’s go find a 32-year-old,” seemingly unaware it’s the exact same kind of discrimination.
  • In addition to over-promoting the whiz kids, the media almost never does any follow-up, telling us what became of the wunderkinds ten or twenty years later.  That’s why I was surprised to see this story in today’s New York Times, For A Mogul Money and Magic Have Limits, which details the dog’s breakfast whiz kid Halsey Minor has made of things since making a fortune off CNet during the Web 1.0 era.  Find the lessons in this quote:  “he thought he was a billionaire, spending far more than he had … but he really was a multi-millionaire always thinking I’m going to make the big score.”
  • The asymmetric media coverage gives people a distorted sense of reality:  (1) that they must start a company before they’re 30 or they never will, (2) that after 30 they are washed up, (3) that the odds of succeeding in a venture are way higher than they are, (4) that skills are more the determinants of success than luck, and (5) that youth/energy are more important than experience.
  • Point 4 is the Fooled by Randomness effect.  We don’t worship lottery winners, we just consider them lucky.  In business, we tend to equate financial success with skill and further sense that each idiosyncrasy is a cause of success.  If Google has free lunch, we’ll have free lunch.  If Steve Jobs wears jeans and a black t-shirt, then we should wear jeans and black t-shirts.  All notions of luck and causality get confused in the business media.
  • Regarding point 5, I’d like to ask the freshly-minted MBAs in my readership to ask themselves a question:  do you believe that you will be a better manager now or twenty years in the future when you still have the same degrees, the same intelligence, but twenty years of management experience?

But the thing that most amazes me about Silicon Valley ageism is that it’s often practiced by the 40+ crowd.  Here, neither self-interest nor logic prevails, just, I suspect, intellectual laziness.

Categories: Companies

Quick Take on the Dassault Systèmes Acquisition of Exalead

Wed, 06/09/2010 - 22:12

Today, in what I consider a surprising move, French PLM and CAD vendor Dassault Systèmes announced the acquisition of French enterprise search vendor Exalead for €135M or, according to my calculator, $161M.  Here is my quick take on the deal:

  • While I don’t have precise revenue figures, my guess is that Exalead was aiming at around $25M in 2010 revenues, putting the price/sales multiple at 6.4x current-year sales, which strikes me as pretty good given what I’m guessing is around a 25% growth rate.  (This source says $21M in software revenue, though the year is unclear and it’s not clear if software means software-license or software-related.  This source, which I view as quite reliable, says $22.7M in total revenue in 2009 and implies around 25% growth.  Wikipedia says €15.5M in 2008 revenues, which equals exactly $22.7M at the average exchange rate.  This French site says €12.5M in 2008 revenues.  The Qualis press release — presumably an excellent source — says €14M ($19.5M) in 2009 revenues.  Such is the nature of detective work.)
  • I am surprised that Dassault would be interested in search-based applications, Exalead’s latest focus.  While PLM vendors have always had an interest in content delivery and life-cycle documentation (e.g., a repair person entering feedback on documentation that directly feeds into future product requirements), I’d think they would want to buy an enterprise techpubs / DITA vendor rather than a search vendor to do so, as in the PTC / Arbortext deal of 2005.  Nevertheless, Dassault President and CEO Bernard Charlès said that with Exalead they could build “a new class of search-based applications for collaborative communities.”  There is more information, including a fairly cryptic video which purports to explain the deal, on a Dassault micro-site devoted to the Exalead acquisition, which ends with the phrase:  search-based applications for lifelike experience.  Your guess as to what that means is as good as mine.
  • A French investment firm called SCA Qualis owned 83% of Exalead, steadily building up its position from 51% in 2005 to 83% in 2008 through successive rounds of €5M, €12M, and €5M in 2005, 2006, and 2008 respectively.  This causes me to question the CrunchBase profile’s claim that Exalead had raised a total of $15.6M.  (You can see €22M since 2005 and the company was founded in 2000.  I’m guessing there was $40M to $50M invested in total, though some reports are making me think it’s twice that.)
  • The prior bullet suggests that Qualis took $133M of the sale price and everybody else split $27M, assuming there were no active liquidation preferences on the Qualis money.
  • Given the European-focus, the search-focus, and the best-and-brightest angle (Exalead had more than its share of impressive grandes écoles graduates), one wonders why Autonomy didn’t end up owning Exalead, as opposed to a PLM/CAD company.  My guess is Autonomy took a look, but the deal got too pricey for them because they are less interested in paying up for great technology and more interested in buying much larger revenue streams at much lower multiples.  In some sense, Autonomy’s presumed “pass” on this deal is more proof that they are no longer a technology company and instead a CA-like, Oracle-like financial consolidation play.  (By the way, there’s nothing wrong with being a financial play in my view; I just dislike pretending to be one thing when you’re actually another.)
  • One wonders what role, if any, the other French enterprise search vendor, Sinequa, played in this deal.  They, too, have some great talent from France’s famed Ecole Polytechnique, and presumably some nice technology to go along with it.
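For anyone who wants to check the back-of-envelope math in the bullets above, here is a small Python sketch.  It rests on two assumptions: the sale price is inferred from the $133M + $27M split rather than taken from a disclosed figure, and the $25M revenue number is my guess from above, not a reported result.

```python
# Back-of-envelope check of the deal math in the bullets above.
# Assumptions: the price is inferred from the $133M + $27M split, and
# the 2010 revenue figure is a guess, not a reported number.
price_musd = 133 + 27        # implied sale price, in $M
revenue_2010_musd = 25       # guessed 2010 revenue, in $M
qualis_stake = 0.83          # Qualis ownership as of 2008

multiple = price_musd / revenue_2010_musd
qualis_take = price_musd * qualis_stake
others_take = price_musd - qualis_take

print(f"price/sales multiple: {multiple:.1f}x")
print(f"Qualis: ${qualis_take:.0f}M, everybody else: ${others_take:.0f}M")
```

The split only holds if there were no active liquidation preferences, as noted above.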

Here are some links to other coverage of the deal

Categories: Companies

Marketing Abuse: The Word “Partnership”

Wed, 06/09/2010 - 20:23

Dear Marketer:

I get about 5 of these emails a day.

Subject:  Partnership Proposal-Damco Inc.

Dear Dave,

Hope you are doing great.

Damco has vast experience in providing high quality and cost effective data processing services to its clients globally. Since its inception in 1996, Damco has honed its level of expertise and built robust processes and methodologies ensuring quick turnaround times, confidentiality and data security. Damco’s offshore delivery centres are ISO 9001:2000 and CMMI Level 3 certified and in addition we are fully compliant with BS7799 security standards and Data Protection Act 1998.

Damco has already delivered its data processing services to leading organizations in various industries including – Publishers, Libraries, Law Firms, Insurance Companies, Credit Card Companies, Market Research Companies, Healthcare Providers, Universities, Hospitality, Airlines, Banks, Registration companies, Government.

Highlights of our offerings are:

a) Up to 50% Cost Saving from Outsourcing
b) Domain Experience & Technical Expertise
c) High Quality standards in accordance with ISO 9001:2000
d) Well defined processes and methodologies
e) Data Protection, Confidentiality and Service Level Agreements
f)  State-of-the-art Communication Facilities

[Next 5 paragraphs omitted]

I have many objections to these emails, which typically come from off-shoring companies.  Let’s share some lessons about what’s wrong with them.

  • First, they are deceptive.  They are not about “partnership” (unless of course you define partnership as “I give you money and you give me offshore developers,” which I don’t).
  • They start a business relationship based on a lie.  Credibility should be the top priority for the marketing department.  With these mails you first get my attention and then immediately destroy your credibility, the equivalent of expending great energy to shout:  I’M DAVE AND I SUCK.  (Why say anything at all?)  I know very little about Dacom or Damco or whoever they are, but I do know one thing:  they are willing to send misleading emails to increase lead conversion rates and therefore I want nothing whatsoever to do with them.
  • They bury me in useless facts that neither differentiate the offerings nor make me interested in doing business with the company:  the mails are, quite literally, all the same.  Everyone is CMMI this and ISO that.
  • They are mis-leveled.   They go to the trouble of renting a CEO mailing list and then write copy that is neither CEO-level nor designed for the #2 thing CEOs do with email:  forward it to a direct report. (The #1 thing is delete and junk-list the sender.)  Done correctly, the starting copy would be written to make me want to forward the mail to my VP of Engineering and the rest of the copy would be written for him.

You could preserve your credibility and try to find a more strategic marketing angle with a subject like:

  • Outsourcing:  Five Things You Didn’t Know
  • Finally, Something Different in an Outsourcing Vendor
  • Yet Another Outsourcing Mail, Not.  Three Reasons Acme’s Different

Or, apply some of Porter’s generic strategies and head along one of two primary dimensions:

  • Outsourcing At Rock-Bottom Cost, Here’s How We Can Do It (cost leadership)
  • How Thing X Makes Vendor Y Unique in Outsourcing (differentiation)

But no matter your chosen angle, Dear Marketer, please remember this:  do not start a business relationship with a lie.

The relationship will last only as long as it takes to hit “junk sender” and you will be permanently muted thereafter.

Cheers,

Dave

Categories: Companies

MarkLogic and the Warrior Gateway Profiled on NBC News

Wed, 06/02/2010 - 00:54

MarkLogic and its customer and partner the Warrior Gateway were profiled yesterday in this NBC news story (video below and brief write-up here).

We’re proud to work with the team on Warrior Gateway to help build a site that I view as Kayak plus Yelp all rolled into one, aimed at the specific needs of veterans and their families.  Here are some statistics that might surprise you:

  • There are 25,000,000 veterans living in the USA today
  • This number expands by approximately 280,000 each year
  • The average age of a transitioning soldier is 25 years old
  • 38,000 veterans have been wounded in OIF and OEF
  • As many as 1 in 4 veterans from Iraq and Afghanistan have been diagnosed with mental disorders
  • As many as 1 in 5 have been diagnosed with symptoms of brain injury
  • Suicide rates among male veterans aged 20-24 are 4x the national average
  • One in two homeless are Iraq or Afghanistan veterans

For more information on the Warrior Gateway, visit the about page here, the site here, their blog here, or the page about the project from its sponsoring organization,  Business Executives for National Security, here.

I’ve embedded the video of the news segment below.

The Associated Press published a story about the Warrior Gateway here.

Categories: Companies

David Worlock on Super-Distribution

Mon, 05/24/2010 - 18:34

Just a quick post to share the slides from avuncular publishing industry guru David Worlock who presented at a publishing/media breakfast briefing that MarkLogic sponsored in the UK a few weeks back.  David blogged about a question he received during the briefing in this post, One Last Squeeze of the Lemon.  I’ve embedded David’s slides below.  You can download them from his site, here.

Superdistribution by David Worlock
Categories: Companies

Thoughts on the SAP Acquisition of Sybase: In Search of Credibility

Thu, 05/13/2010 - 15:40

Yesterday SAP announced that it was acquiring database and mobile provider Sybase for $65/share, or approx $5.8B in total, a 44% premium over Sybase’s trailing three-month average stock price and a 56% premium over Tuesday’s closing price.  Here are some quick thoughts on the deal.

  • SAP has been trying to figure out a way to get arch-rival Oracle out from underneath the majority of its deployments for about a decade.  For example, they did a partnership with runner-up DBMS provider Software AG to create MaxDB.  Recently Hasso Plattner has been working with the Hasso Plattner Institute on an in-memory, column-oriented database.  He presented a paper at SIGMOD on this work (which I’ll call HassoDB) and recently did a bizarre-ish video called Hasso on Hasso where he interviews himself discussing the project.
  • SAP’s efforts thus far have lacked credibility.  No serious Oracle shop would consider moving to MaxDB.  Adabas is seen as a C-tier relational database provider in an oligopoly-dominated market (i.e., Oracle, IBM, Microsoft).  Nor, to my knowledge, has Hasso’s work been taken seriously by the academic community; friends I know who attended the SIGMOD where he presented — and I’ll be nice for a change — said the paper was not particularly well received.
  • SAP has a database problem.  That’s clear.  And I think buying Sybase was probably the best way out of it.  The price, at 4.8x TTM sales, seems high, as does the 50%-ish premium.  But then again, SAP didn’t have any real alternative if it wanted to buy size and credibility in the relational database market.  The only other $1Bish company in the space is Teradata and they are data warehouse oriented.  SAP presumably wants a data warehouse DBMS, but they need an OLTP DBMS as well.  With its wide portfolio of DBMSs (e.g., column-oriented, in-memory, mobile, OLTP), Sybase fits the bill nicely.  And that’s not to mention its Sybase 365 mobile services which position it well in mobile analytics.
  • The acquisition seems pretty controversial.  One banker I spoke to yesterday thought it was a terrible idea.  I coincidentally spoke to some top DBMS industry analysts yesterday and they liked it.  My analysis is simple:  once SAP finally decided to solve their database problem (which, yes, they should have solved years ago), what other option did they have?  Among the options obvious to me (e.g., partnering with IBM to leverage DB2, trying to commercialize HassoDB, buying Software AG, buying Teradata), this was the best one.  The question isn’t how did they get themselves into this difficult situation and why were they asleep when Oracle consolidated a huge chunk of the enterprise software industry?  The question is what should they do about it, right now? Sybase seems a reasonable choice, infinitely preferable to what I thought they were going to do:  a quixotic attempt at turning HassoDB into a real competitor.   (Presumably, the Sybase unit will now get that task and the odds of success go up by about 100x in handing it over.)
  • I can’t help but mention the irony here.  SAP was a key reason that Sybase ended up a B-tier DBMS.  In the 1990s, when ERP application sales became a major driver for RDBMS purchases, Sybase lacked row-level locking which SAP required.   While Sybase played a leading role in the OLTP phase of the RDBMS market, they were locked out of the party when the ERP-driven phase hit.  While Sybase eventually fixed the row-level locking issue (which was one of many knives also stuck in Ingres), it was too late.
  • If SAP thinks this is going to be another easy Business Objects style integration, they are wrong.   Business Objects was naturally synergistic to SAP’s product offering and BI tools remained at the top of CIO priority lists around the time of the SAP / Business Objects deal.  So while I think the Sybase deal is a good strategic move, I think it’s going to be about 10x harder to sell Sybase to SAP customers than it was to sell Business Objects.  Last time, they were selling a complementary add-on product in a hot category; this time they’re asking customers to tear up railroad tracks.
  • Is this deal about credibility?  Yes.  Does buying Sybase give SAP a lot of DBMS credibility?  No.  Sybase’s market share is less than 5%, but they have a nice portfolio of DBMS technologies on which SAP can build.  Of SAP’s available alternatives, does this deal get SAP the most credibility?  In my estimation, yes.
  • Finally, a quick note on HassoDB.   The basic idea is that column-oriented databases go fast for data warehousing because data warehouse queries typically aggregate detail in columns.  Ergo, column orientation increases information density for these types of queries.  The problem is that column orientation is a disaster for OLTP operations because what would have been one simple insert/update gets split across N columns.  The solution to this problem, argues Hasso, is to put the whole thing in memory, ergo preserving the benefits of column-orientation while eliminating the drawbacks.
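To make the row-versus-column trade-off in the last bullet concrete, here is a toy Python sketch.  It is purely illustrative: the table, column names, and values are made up, and it says nothing about how HassoDB or any Sybase product actually stores data.

```python
# Toy illustration of row vs. column layout; not any vendor's real design.
# The table and its values are hypothetical.
rows = [
    {"order_id": 1, "region": "East", "amount": 100.0},
    {"order_id": 2, "region": "West", "amount": 250.0},
    {"order_id": 3, "region": "East", "amount": 75.0},
]

# Column layout: one list per column instead of one record per row.
columns = {name: [r[name] for r in rows] for name in rows[0]}

# Warehouse-style aggregate: reads a single column and nothing else.
total = sum(columns["amount"])

# OLTP-style insert: a single append in the row layout ...
new_row = {"order_id": 4, "region": "West", "amount": 30.0}
rows.append(new_row)
row_writes = 1

# ... but one separate write per column in the column layout.
column_writes = 0
for name, value in new_row.items():
    columns[name].append(value)
    column_writes += 1

print(total, row_writes, column_writes)
```

With N columns the insert becomes N writes, which is the drawback that keeping everything in memory is meant to soften.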
Categories: Companies

The Information Continuum and the Three Types of Subtly Semi-Structured Information

Tue, 05/11/2010 - 19:24

We generally refer to MarkLogic Server as an XML server, which is a special-purpose database management system (DBMS) for unstructured information.  This often sparks debate about the term “unstructured” and the information continuum in general.  Surprisingly, while both analysts and vendors frequently discuss the concept, the Wikipedia entry for information continuum is weak, and I couldn’t easily find a nice picture of it, so I decided to make my own.

The general idea that information spans a continuum with regard to structure is pretty much undisputed.  The placement of any given type of information on that continuum is more problematic.  While it seems clear that purchase orders are highly structured and that free text is not, the placement of, for example, email is more interesting.  Some might argue that email is unstructured.  In fact, only the body of an email is unstructured and there is plenty of metadata (e.g., from, send-to, date, subject) wrapping an email.  In addition, an email’s body actually does have latent structure:  while it may not be explicit, you typically have a salutation followed by numerous paragraphs of text, a sign-off, a signature, and perhaps a legal footer.  Email is unquestionably semi-structured.

In fact, I believe that the vast majority of information is semi-structured.  PowerPoint decks have slides, slides have titles and bullets.  Contracts are typically Word documents, but have more-or-less standard sections.  Proposals are usually Word or PowerPoint documents that tend to have similar structures.  Even the humble tweet is semi-structured:  while the contents are ostensibly 140 unstructured characters, the anatomy of a tweet reveals lots of metadata (e.g., location) and even the contents contain some structural information (e.g., RT indicating re-tweet or #hashtags serving as topical metadata).

Now let’s consider XML content.  Some would argue that XML is definitionally structured.  But I’d say that an arbitrary set of documents all stored within <document> and </document> tags is only faux structured; it appears structured because it’s XML, but the XML is just used as a container.  A corpus of twenty 2,000-page medical textbooks in 6 different schemas is indeed structured, but not well so.  To paraphrase an old saw about standards:  the nice thing about structures is that there are so many to choose from.  I believe that knowing content is marked up in XML reveals nothing about its structure, i.e., that XML-ness and structure are orthogonal.  Put differently, XML is simply a means of representing information.  The information represented may be highly structured (e.g., 100 purchase orders all in perfect adherence to a given schema) or highly unstructured (e.g., 20 documents only vaguely complying with 20 different schemas).

I have two primary beliefs about the information continuum:

  • The vast majority of information is semi-structured. There is relatively little highly structured and relatively little completely unstructured information out there.  Most information lies somewhere in the fat middle.  I overlaid a bell curve on top of the information continuum to reflect volume.
  • Even information that initially appears structured is often semi-structured.  I see three types of this subtly semi-structured information which, hopefully without being too cute, I’ll abbreviate as SSSI.  The three types are (1) schema as aspiration, (2)  time-varying schema, and (3) unknowable schema.

Let’s look at each of the three types more closely.

Schema as Aspiration

The first type of subtly semi-structured information (SSSI) is where a schema exists, but only notionally.  The schema itself is either poorly defined (actual quote:  “it is believed that this element is used for”) or well defined but not followed.  This is frequently the case with publishing and media companies.  Here are two free jokes that work well at any publishing conference:

  • Raise your hand if you have a standard schema.  Keep it up if your content actually adheres to it.
  • Oxymorons aside, how many of you have 3 or more “standard” schemas, 5 or more, … do  I hear 10?

These jokes are funny because of the state of the content.  This state is the result of two primary business trends:  (1) consolidation — most large publishers have been built through M&A thus inheriting numerous different standards, each of which may be only partly implemented — and (2) licensing — publishers frequently license content from numerous other sources, each with its own standard format.

Time-Varying Schema

The second case of SSSI is where you have a well defined, enforced schema at any moment in time, but it keeps changing over time.  Typically this happens for one of two reasons:

  • The business reality that you’re modeling is changing.  For example, in 2009 Federal Sales was part of Eastern Sales but in 2010 it becomes its own division.  This makes comparison of Eastern results between 2009 and 2010 potentially difficult.  In BI circles, this is known as the slowly changing dimension problem.
  • Standards keep changing.  If you’re modeling information in a corporate- or industry-standard schema and that schema is changing, then your information becomes semi-structured because it is contained within multiple different schemas.  Sometimes you can avoid this by migrating all prior information to the current schema, but sometimes (e.g., massive data volumes, regulatory desire to not change existing records) you will not.

When viewed with a flash camera this information looks well structured.  When you look at the movie, you can clearly see that it’s not.
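Here is a small Python sketch of the Federal Sales example above.  The sales numbers and the Commercial sub-unit are invented for illustration; the point is only that each year is well structured on its own, while a naive year-over-year comparison silently mixes two different structures.

```python
# Hypothetical sales figures ($M) illustrating the Federal/Eastern example.
# In 2009 Federal sits inside Eastern; in 2010 it is its own division.
sales_2009 = {"Eastern": {"Commercial": 80, "Federal": 20}}
sales_2010 = {"Eastern": {"Commercial": 90}, "Federal": {"Federal": 25}}

def division_total(year_data, division):
    """Sum every sub-unit reported under a division for one year."""
    return sum(year_data.get(division, {}).values())

# Naive comparison: same division name, different meaning in each year.
naive_growth = (division_total(sales_2010, "Eastern")
                / division_total(sales_2009, "Eastern") - 1)

# Restated comparison: apply the 2010 structure to the 2009 numbers.
restated_2009_eastern = sales_2009["Eastern"]["Commercial"]
restated_growth = (division_total(sales_2010, "Eastern")
                   / restated_2009_eastern - 1)

print(f"naive: {naive_growth:+.1%}, restated: {restated_growth:+.1%}")
```

In this made-up case the naive view shows Eastern shrinking while the restated view shows it growing, which is exactly the flash-camera versus movie distinction.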

Unknowable Schema

The last case of SSSI is where you have an unknowable schema.  Consider terrorist tracking.  If you were to make a schema for a terrorist database, here are some of the attributes that spring to mind:  name, alias(es), address, former address(es), height, weight, hair color, eye color, member-of, enemy-of, friend-of, tattoos/markings.

Here are some problems with this:

  • Many of the attributes are multi-valued, such as alias or friend-of.  In a de-normalized approach, this means dealing with repeating group problems and creating N columns (e.g., alias, alias1, alias2, and up to the maximum number of aliases for any terrorist).  Normalization would take care of the repeating group but at the cost of creating a table for each multi-valued attribute and then having to join back to those tables when you run queries.  (One such real system ended up with 500 tables, with the result that no one could find anything.)
  • It is difficult to create a type for the tattoo attribute.  First, it’s multi-valued.  Second, while tattoos are sometimes images, they often contain text (e.g., Mom) and sometimes in a foreign language (e.g., 愛, the Chinese symbol for love).  Since you’re trying to secure the nation against threat you don’t want to throw away any potentially valuable information, but it’s not obvious how to store this.
  • New attributes are coming all the time.  Say you get a shoe print on a suspect as he runs away.  You need to add a shoe-size attribute to the database.  Say a terrorist runs away and leaves a pair of eyeglasses.  Now we need to add eyeglass prescription.  My favorite is what’s called pocket litter.  You find a piece of paper in a person’s pocket and it has a number on it.  It could be a phone number, a lock combination, or maybe map coordinates.  You don’t know what it is but, again, since you don’t want to throw away any potentially valuable information, you have to find a place to store it.
  • Combining an enormous number of potential attributes with the reality that very few are known for most individuals creates two problems:  (1) you end up with a sparse table which is not well handled in most RDBMSs and (2) you end up hitting column limits.
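The normalization trade-off in the first bullet can be sketched with Python’s built-in sqlite3 module.  The table names, column names, and data below are hypothetical and deliberately tiny; a real system would need one such side table per multi-valued attribute.

```python
import sqlite3

# Hypothetical two-table schema: one side table for one multi-valued attribute.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE alias  (person_id INTEGER REFERENCES person(id), alias TEXT);
    INSERT INTO person VALUES (1, 'Person A'), (2, 'Person B');
    INSERT INTO alias  VALUES (1, 'Alias One'), (1, 'Alias Two');
""")

# Reading a person back with aliases already requires a join.
rows = conn.execute("""
    SELECT p.name, a.alias
    FROM person p LEFT JOIN alias a ON a.person_id = p.id
    ORDER BY p.id, a.alias
""").fetchall()

for name, alias in rows:
    print(name, alias)
conn.close()
```

Repeat this for friend-of, enemy-of, former addresses, tattoos, and so on, and it is easy to see how a system ends up with hundreds of tables and a join for every question.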

Another example of unknowable schemas would be in financial services, modeling derivatives.   Because derivatives are sometimes long-lived instruments (e.g., 30 years) you may face the time-varying schema problem.  In addition, you have the unknowable schema problem because the industry is constantly creating new products.  First we had CDOs and CDSs on banks, then single-tranche CDOs, then CDSs on single-tranche CDOs, and then synthetic CDOs.  If this makes your head hurt in terms of understanding, then think for a minute about data modeling.  How are you going to store these complex products in a database?   And what are you going to do with the never-ending stream of new ones — last I heard they were considering selling derivatives on movies.

(As it turns out XML is a great way to model both these problems as you can easily add new attributes on the fly and only provide values for attributes where you know them.)
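As a rough illustration of that point, here is a Python sketch using the standard library’s xml.etree.ElementTree.  The element names and values are made up for this example and are not a MarkLogic schema or API; the sketch only shows that repeated elements and one-off elements can sit side by side without any table redesign.

```python
import xml.etree.ElementTree as ET

# Hypothetical records: element names are invented for illustration.
doc = ET.fromstring("""
<people>
  <person>
    <name>Person A</name>
    <alias>Alias One</alias>
    <alias>Alias Two</alias>
    <tattoo lang="zh">愛</tattoo>
  </person>
  <person>
    <name>Person B</name>
    <shoe-size>11</shoe-size>
    <pocket-litter>555-0100</pocket-litter>
  </person>
</people>
""")

people = doc.findall("person")

# Multi-valued attribute: simply repeat the element.
aliases = [a.text for a in people[0].findall("alias")]

# Sparse attribute: present only on the records where it is known.
with_shoe_size = [p.findtext("name") for p in people
                  if p.find("shoe-size") is not None]

print(aliases, with_shoe_size)
```

Adding eyeglass prescription tomorrow means adding one element to one record, with no change to the records that lack it.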

To finish the post, I’ll revisit the statement I started with:  we generally refer to MarkLogic Server as an XML server, a special-purpose database management system (DBMS) for unstructured information.  Going forward, I think I’ll keep saying that because it’s simpler, but at the MarkLogic 201 level, the more precise statement is:  a special-purpose DBMS for semi-structured information.

There’s way more semi-structured information out there.  Realizing that information is semi-structured is sometimes subtle.  And semi-structured information is, in fact, the optimization point for our product.  So what’s MarkLogic in three concepts?  Speed, scale, and semi-structured information.

Categories: Companies

Save The Date: MarkLogic 2011 User Conference

Fri, 05/07/2010 - 18:08

Now that we’re done with the stellar Mark Logic 2010 User Conference, it’s time to ask everyone to save the date for the MarkLogic 2011 User Conference which will be held at the wonderful Palace Hotel in San Francisco on 4/26 – 4/29/2011.

Our current plan is to have pre-conference training on Tuesday 4/26.  The conference will be 2.5 days, starting Wednesday 4/27 in the morning and going through Friday 4/29/2011.  On the afternoon of Friday 4/29, we’ll have a supplemental session for MarkLogic partners.  (Note that this format slides the conference back one day in terms of what day-of-the-week we do things; we’ve done that because 4/24 is Easter and we don’t want people to have to choose between traveling on Easter and missing the pre-conference training.)

PLEASE MARK THESE DATES NOW IN YOUR CALENDAR.  You can always clear them later if you don’t want to come.

Note that our marketing head, Tracy Eiler, and I are wondering if it’s time to give the conference a name, so I’d be open to any suggestions.  To get your creative juices flowing, here are some other companies and how they’ve named their user conferences.

  • SAP:  Sapphire
  • Oracle:  Oracle Open World
  • Endeca:  Discover
  • Facebook:  F8
  • Twitter:  Chirp
  • Informatica:  Informatica World
  • Autonomy:  Inorganic Growth World (formerly, Bayesian BlackBox World)*

I won’t bias the brainstorming by starting with my own ideas.  Feel free to comment with suggestions or email me at ceo at marklogic dot com.

* Yes, the Autonomy ones aren’t real; I was just sniping.  On a serious note, I was surprised to find that Autonomy doesn’t seem to have a user conference, and I’ll let the reader conclude on his/her own the interpretation of such.  (Mine isn’t good.)

Categories: Companies

Norm Walsh XML Rock Star Video

Fri, 05/07/2010 - 17:21

One MarkLogic tradition that I borrowed from Business Objects is the kickoff internal video contest.  You create a contest where any group or person can submit a short video on any topic they’d like … and then see what happens.  As an employee, I like these contests because the videos are typically quite funny.  As CEO, I like them because they reveal organizational culture and pathology.

While at Mark Logic I’ve yet to find much pathology, at Business Objects, where we were both bigger and had a higher level of traditional conflict (e.g., field vs. corporate, USA vs. France, sales vs. engineering), the submissions often provided a clear window into key problems afflicting the corporate soul.  So I always look at these videos through two lenses:  an employee one and a managerial one.

Once in a while you get an entry that is truly outstanding:  one that is funny, well produced (e.g., made by someone who actually studied film), and revealing of culture.  Such is the following entry which features Norm Walsh (who in real life, despite XML guru status, is as low ego and nice a guy as you’re likely to meet — he had to be cajoled into doing this) as an XML rockstar.  It was made by our media consulting organization and produced/directed by Paxton Hare.  It’s worth watching multiple times because there is a lot of great detail.

If you like Paxton’s work, he’s made a few films over the years and you can get more information here.

Categories: Companies

Introducing the New MarkLogic Logo

Tue, 05/04/2010 - 19:37

After nearly six years at MarkLogic, we’ve decided to do a light refresh on our corporate identity, changing a few things:

  • The company name.  We changed this from Mark Logic (with a space) to MarkLogic with no space.  Previously, we had a subtle distinction where MarkLogic meant product and Mark Logic meant company.
  • The logo.  The intent here was to promote the “Logic” in the logo.  The previous branding strategy was predicated on people calling the company Mark.  Hence the old logo had a big Mark and a little Logic.   In reality, people call us MarkLogic and we wanted a logo that reflected that.
  • The color.  Marketing SVP Tracy Eiler brightened up the official MarkLogic red as a part of this process.

We also have new business cards and a new PowerPoint template, which I used for my keynote speech this morning.  Here’s the new logo:

Categories: Companies

Slides from my Presentation at the Mark Logic 2010 User Conference

Tue, 05/04/2010 - 19:17

Just a quick post to share the slides that I presented this morning at the Mark Logic 2010 User Conference.

Kellogg Mark Logic 2010 User Conference
Categories: Companies

Foursquare, Location Broadcasting, and Trust

Sun, 05/02/2010 - 01:20

I wrote a post last week sharing my first impressions of Foursquare, the much-hyped new Internet service that combines geolocation with social networking and elements of gaming.  Just as Facebook introduced core new concepts of “friends” and “status updates,” there is one core new concept in Foursquare: “checking in.”  When you arrive somewhere, you check-in to the location, announce to your Foursquare friends that you are there, and can optionally send a check-in announcement to your Facebook status feed.

Checking in is a useful concept.  A GPS can’t tell if you are in the Tanuki Tavern on the ground floor of the Gansevoort Hotel or the Plunge bar on top.  (To my knowledge, GPSs don’t do altitude.)  Nor, with 6 or so meter precision, can a GPS nail whether you are in The Brick Lane Curry House, Zerza, or Taj, all tucked tightly in a row on East 6th Street.

So I like the idea of checking in, even though some folks are already complaining of check-in fatigue.

But checking-in also gave me the seemingly false impression that I could control when I broadcast my location to the world.  I figured, if I wanted people to know where I was, I could check-in.  And conversely, if I didn’t want people to know my location, then I wouldn’t.  For example, if I were in Lake Tahoe in December, it would seem obvious that I was on vacation, and thus robbers could know I was away.  Thus, I wouldn’t do it.  (A website, PleaseRobMe, has already been set up to highlight these dangers to people.)

But last week I began to suspect that Foursquare was constantly beaming my location from my phone back to Foursquare central and posting it on the web.  I did a few experiments during the past week to verify this and it appears to be true.  Whether you check-in or not, Foursquare tells not only your friends — but seemingly the whole Internet — where you are.

If true, I view this as a major trust violation; nowhere during the registration process did they make this not-so-little detail evident.  In fact, I wouldn’t have noticed it if I’d not logged into Foursquare on the web, which I’m guessing many mobile users never do.  And the whole notion of checking-in seems to be at odds with the notion of constant location-broadcasting.  So this was quite bad for my trust in Foursquare.  (And, as an aside, boy, was last week a bad week for trust on the Internet.)

Foursquare needs to do something about this.  They need to:

  • Make it clear during registration that this automatic location-beaming is going to happen
  • Make it controllable:  friends-only, friends-of-friends, etc.
  • Make it disable-able:  so you can turn it off if you so desire
Categories: Companies

Quick Take on Foursquare

Tue, 04/27/2010 - 21:50

I’ve heard so much hype about Foursquare that I had to give it a try, so I downloaded the Blackberry App and have been playing with it for a few days.  Here are my quick initial impressions:

  • I give them an A+ for what I call “minimal implementation of core concept.”  Lots of startups have a core concept that needs rounding-out over time and they get confused about how much core vs. how much rounding-out they should do in the first release.  I believe that for breakthrough products/services your first release should be all core, little round-out.  Foursquare implements this philosophy well:  this is about friends and their locations, period.  The app can’t even help you edit your profile picture (e.g., mine got all distorted), so I had to edit it myself on my PC and then re-upload it.  But that’s perfect.  Having a nice picture upload, crop, and edit function is precisely what you don’t want a location-based services startup focusing on.
  • I initially wondered why you’d need Foursquare.  After all, if I want to find my friends I can theoretically do that through Facebook status updates already. For example, I can status “eating lunch at the Red Eye Grill” and my friends can notice that I’m there.  The problem is, of course, in an information-overloaded world of tweets, Facebook statuses, LinkedIn statuses, and other social network exhaust, these where-I-am status updates are easily lost amid the update flotsam.  What’s more, having the text “Red Eye Grill” and knowing that it’s a restaurant in Manhattan at 890 7th Ave which is at (40.76506, -73.98031) are two different things.  The former is pretty much useless without the addition of human context, the latter can be geo-searched, mapped, etc.
  • Because Foursquare knows where you are from your phone’s GPS and because you can check in to places you go, Foursquare can very easily determine who among your friends are at the same venue as you (think:  a big crowded nightclub) or simply who is nearby.  I can envision good uses for that.
  • Frankly, I don’t get the whole “mayor” thing, i.e., that a given user can become the mayor of a venue.  For example, Jason M is the mayor of local pub Pudley’s as well as 12 other venues.  Mark Logic engineer Pete A. has beaten me to becoming the mayor of Mark Logic and I don’t know how to unseat him.  Right now, I view this, like badges, as harmless fun and silliness.
  • I find the Blackberry application slow.
  • In the privacy department, this one creeps me out:  I think the Foursquare application on my phone might be periodically beaming my GPS coordinates to Foursquare central without me knowing about it.  For the first few days, I assumed that Foursquare only knew where I was when I checked-in somewhere.  Now, when I go to my profile page, it seems to always know where I am.  I’ll watch this more closely and then give an update.  Note that one common criticism against location-based social networks is the PleaseRobMe problem, and I wonder if this is accompanied by a PleaseKillMyBattery problem.
  • I like the name Foursquare, though my kids had to tell me it was also the name of a playground game.
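
To make the geo-search point above concrete, here is a minimal sketch of a "which friends are nearby" check.  It is not how Foursquare actually does it; it simply assumes each friend's last check-in is stored as a (latitude, longitude) pair and uses the standard haversine great-circle formula.  The friend names, their coordinates, and the function names are all hypothetical; only the Red Eye Grill coordinates come from the post.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def friends_nearby(me, friends, radius_km):
    """Return sorted names of friends whose last check-in is within radius_km of me."""
    return sorted(
        name
        for name, (lat, lon) in friends.items()
        if haversine_km(me[0], me[1], lat, lon) <= radius_km
    )


# Coordinates quoted in the post for the Red Eye Grill.
RED_EYE_GRILL = (40.76506, -73.98031)

# Hypothetical check-ins, invented purely for illustration.
friends = {
    "alice": (40.76506, -73.98031),  # same venue
    "bob": (40.7580, -73.9855),      # under a kilometer away
    "carol": (37.7749, -122.4194),   # across the country
}

print(friends_nearby(RED_EYE_GRILL, friends, radius_km=1.0))
```

None of this works with the bare text “Red Eye Grill”; it only works once the venue has been resolved to coordinates, which is the whole point.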
Categories: Companies

In Facebook I Don’t Trust

Sun, 04/25/2010 - 17:35

I must admit I’ve always had some degree of trouble “getting” certain aspects of Facebook, particularly when it comes to privacy and money.  This post is about privacy.

The good news is I think I always got the value of the service itself.  I remember when I first saw it, I said:  ah, a friend-aware roll-up of Digg, Twitter, Flickr, YouTube, LinkedIn, and more.  I can post photos, videos, status updates, links, et cetera, all in such a way that I’m guaranteed to get some privacy.  After all, if I wanted the whole Internet to see my photos or status updates, I could just use Flickr or Twitter.  But if I only wanted my friends (or some permutation thereof) to see them, then I could use Facebook.  That’s why I always liked their generic homepage copy:  Facebook helps you connect and share with the people in your life.

I remember thinking a few other things as well:

  • Classmates is toast.  First, I don’t need a special-purpose social network for re-connecting with my old high school buddies; I can use the same one I use for the rest of my friends.  Second, presupposing that our kids will lose connection with their friends and ergo need such a re-connection service is flawed.   We are a transition generation.  We get to experience the novelty of re-connecting with friends from 25 years ago.  I assume my kids will never lose such connections in the first place.
  • I don’t want to make my friend graph ten times.  This suggests the need for an “open graph,” but I figured (vendor machinations aside) that was never going to happen because the social graph was in fact the crown jewel of the social network.
  • How many different graphs do I have or need?  I have a colleague/work-friend graph that I’ve already entered in LinkedIn.  With Facebook, I’m going to have a personal-friend graph.  Do I need a third?   My personal answer is no.   I can imagine folks who want to avoid “when social networking worlds collide” problems — e.g., having your LinkedIn friends exposed to your AdultFriendFinder swingers.  I figured in those cases, people would most certainly go to the trouble of making a separate graph.  (And, by the way, a graph that presumably has little overlap with the work-friend or personal-friend one.)
  • Is Ning going to make it?  Ning seemed to want to solve the multiple-graph problem by making a common platform for social networks, so I could have one profile and one graph for my stamp collecting (fictitious), Business Objects alumni, and other networks.  The idea was logical in one sense, certainly more logical than assuming any graph-owner would open up the graph.  But I wondered how many people wouldn’t be covered through a combination of Facebook and LinkedIn.  My gut answer remains “few.”  Personally, I’ve been invited to one Ning network for Business Objects alums which pretty much died off.   I’ve never joined another.  (And Ning recently seems to have hit troubled waters.)

After using FriendFeed for a while, I realized that it was working at cross-purposes to my desire to divide my social networking worlds between work (LinkedIn) and personal (Facebook).  Yes, I’d be happy to share my Yelp restaurant reviews with my Facebook friends, but I didn’t particularly want to share them with customers.  Nor did I wish to work-spam my friend-friends with news about MarkLogic, XML databases, and the NoSQL movement.  So, I cut the cord between my two primary social networking worlds.

Until recently, I thought I was done.  But Facebook keeps messing around with privacy, most recently trying to set up auto-sharing (“instant personalization”) across a bunch of socially-related sites that I don’t want.  Before that they re-defaulted all my privacy settings to a state where they basically assumed the whole Internet was my “friend.”  Before that, they had the Beacon fiasco.  I get the Twitter jealousy.  But what Facebook doesn’t get is one word:  friend.

If I wanted my personal profile public, I’d put it on my blog.  If I wanted my Facebook status updates public, I’d Tweet them.  If I wanted my photos public, I’d put them on Flickr.  If I wanted my Facebook friends to see my Yelp reviews, I’d set up FriendFeed.  Facebook:  I don’t need you re-setting my privacy all the time because you’re (1) jealous of Twitter and (2) quick to forget the point of Facebook, which is friends.

Unless Facebook figures out its core value and stops repeatedly compromising privacy, it is going to create an opportunity for someone else who does.  Its moves are already backfiring:  the search query “how to set Facebook privacy settings” returns 11M hits.

Since I no longer trust Facebook, I no longer trust their default settings, so I spent about 20 minutes the other day setting every privacy parameter I could find to friends-only.  Before that, I had many things set to friends-of-friends and friends-and-networks.  No longer. Congratulations, Facebook:  in your desire to open up all my information, you only caused me to nail it down tighter than it was before.

While I applaud your success, your ability to make money, and the tremendous valuation you carry on the secondary markets, you need to stay in touch with the core reason why people use your service:  the friend graph.   You have the crown jewel.  Stop abusing it.

Categories: Companies