Feed aggregator

Hadoop Weaknesses and Where Teradata Aster Sees the Big Data Money

myNoSQL - Alex Popescu - 5 hours 10 min ago

An interesting post on the Teradata Aster blog indirectly emphasizes the weaknesses of the Hadoop platform:

  1. Make platform and tools to be easier to use to manage and curate data. Otherwise, garbage in = garbage out, and you will get garbage analytics.
  2. Provide rich analytics functions out of the box. Each line of programming cuts your reachable audience by 50%.
  3. Provide tools to update or delete data. Otherwise, data consistency will drift away from truth as history accumulates.
  4. Provide applications to leverage data and find answers relevant to business. Otherwise the cost of DIY applications is too high to influence business – and won’t be done.

It’s difficult to argue against these points, but they are not insurmountable. I’d even say that once Hadoop deployments become operationally simpler (I think the Apache community, Cloudera, and Hortonworks are already working on this), Hadoop will see even more adoption, and contributions addressing points 2 to 4 will follow shortly.

Yet another interesting part of the post is the two “equations” describing the two environments:

big clusters = big administration = big programs = big friction = low influence (Hadoop)
big data = small clusters = easy administration = big analytics = big influence (ideal/Teradata Aster)

I think these reveal how Teradata Aster is positioning its solutions and where it sees itself making money in the Big Data market. It goes like this: “we can make a lot of money if we offer a platform with lower complexity and operational costs and higher productivity leading to better business results”. This is a sound strategy, and competitors in the Hadoop space would do well to focus on the same aspects, which are essential to wide adoption.

Original title and link: Hadoop Weaknesses and Where Teradata Aster Sees the Big Data Money (NoSQL database©myNoSQL)

Categories: Blogs, NoSQL

The Fantastic 12 of 2012: Behind the Scenes of Managed Self-Service BI

We’re back with a new episode of The Fantastic 12 of 2012: Behind the Scenes Blog Series, where we’re providing unique insights from the SQL Server Engineering Team as they developed SQL Server 2012. This week we are jumping ahead to Number 6 of The Fantastic 12 of SQL Server 2012 (PDF) and we’ll revisit Number 5 in the weeks ahead. Be sure to catch the first four episodes here.

In this new episode John Hancock, Principal Program Manager, provides some interesting insights into a major design decision about the new modeling capabilities, including Key Performance Indicators, Hierarchies, and Perspectives: where should those capabilities go in SQL Server 2012? Should they go into the professional environment, or is there another approach? Find out how the team addressed and solved this challenge in the episode below!

Don’t forget The Fantastic 12 of #SQL2012 Twitter Contest is happening every Thursday at 10:30am PT, where we’re giving away the brand new SQL Server T-Shirts selected by the SQL Family.

Fantastic 12 of SQL Server 2012

6 Managed Self-Service BI

Gain insight and oversight

  • PowerPivot for SharePoint: Balance the need to monitor, manage, and govern the data and analytics end users create with IT dashboards and controls that help IT monitor end user activity, data source usage, and gather performance metrics from servers.

Enable IT Efficiency

  • End user created, IT managed: SQL Server 2012 bridges the gap between end user created BI applications and IT managed corporate solutions by providing the ability to import PowerPivot models into Analysis Services so that they can be professionally managed and transformed into corporate grade solutions.
  • Ease of administration through SharePoint: Enable end user alerting from reports published to SharePoint and benefit from the ease of consolidated management through the SharePoint 2010 Central Administration.
  • SQL Azure Reporting: Extend rich user insights to even more people with SQL Azure Reporting that removes the need for deploying and maintaining a reporting infrastructure.
Categories: Companies, SQL Server

Big Data Episode 1: Overview for the Boss

Oracle Database Insider Blog - 10 hours 31 min ago

The boss needs to present a big data strategy to the CEO. But what's it all about? And above all, what's the value to the business? In this video two team members give him an overview and then get to work filling in the details.

Categories: Companies, Oracle

May 23 Live Webcast: Oracle Database Appliance Best Practices

Oracle Database Insider Blog - 10 hours 34 min ago

Simplify Database Management with Oracle Database Appliance Deployment Scenarios

Business users increasingly demand 24x7 availability of their data while IT departments face the challenge of ensuring maximum availability while operating with limited budgets.

By deploying Oracle Database Appliance, organizations can benefit from a reliable system that significantly reduces the time spent on routine system administration and maintenance, lowering operational costs, and allowing IT personnel to focus on higher value activities.

Using proven deployment best practices, midsize customers and enterprise departments alike can quickly integrate Oracle Database Appliance into their backup, test, development, and production environments. And since Oracle Database Appliance is based on Intel® Xeon® processors, organizations can ensure a high level of performance and scalability.

Join Oracle Database Appliance experts Tammy Bednar, David Swanger, and Intel expert Fabrizio Giamello for this live Webcast and learn how to:

  • Achieve a high quality of service at the lowest cost
  • Reduce up-front investment in hardware and software
  • Implement best practices across a multitude of deployment scenarios

Register today and get answers to your questions live from the experts.

Categories: Companies, Oracle

Licensing: Could It Be Simpler?

myNoSQL - Alex Popescu - 11 hours 6 min ago

In case you think licensing is easy, read this post by Alex Gorbachev explaining how remote mirroring, backup, and cold failover come with their own licensing implications[1]. My thoughts went from “it’s probably me”, to “this can’t be true”, to “not only will you need an army of people to set things up, but also an army to understand what you need to pay for”.

By the way, I consider licensing as being an important part of the experience of a product. The more complicated it is, the less I feel like trying the product, even if feature-wise it comes close to my requirements.

  1. The post refers to Oracle licensing, but I’d venture to say that you could probably find the same simple licensing system in many other places.

Original title and link: Licensing: Could It Be Simpler? (NoSQL database©myNoSQL)

Categories: Blogs, NoSQL

Iqbal Goralwalla Wows the Audience with DB2 9.7 Fix Pack “Pearls” on DB2Night Show™

Were you unfortunate enough to miss IBM Champion Iqbal Goralwalla talk about DB2 LUW 9.7 Fix Pack “Pearls” on the latest DB2Night Show™? It was a great show. Luckily replays are still available, so why not take time out to …
Categories: Companies, DB2

When using the Task Parallel Library, Wait() is a BAD warning sign

Ayende @ Rahien - 16 hours 2 min ago

Take a look at the following code:

public static Task ParseAsync(IPartialDataAccess source, IPartialDataAccess seed, Stream output, IEnumerable<RdcNeed> needList)
{
    return Task.Factory.StartNew(() =>
    {
        foreach (var item in needList)
        {
            switch (item.BlockType)
            {
                case RdcNeedType.Source:
                    source.CopyToAsync(output, Convert.ToInt64(item.FileOffset), Convert.ToInt64(item.BlockLength)).Wait();
                    break;
                case RdcNeedType.Seed:
                    seed.CopyToAsync(output, Convert.ToInt64(item.FileOffset), Convert.ToInt64(item.BlockLength)).Wait();
                    break;
                default:
                    throw new NotSupportedException();
            }
        }
    });
}

Do you see the problem here?

This code came out of a code review comment about improper use of async in a project. The result was a lot of Task return types on methods, but no measurable improvement in the codebase’s actual use of asynchronicity. Each Wait() call blocks a thread until the copy completes, so the method only moves the blocking onto another thread.

The problem is that doing this properly in C# 4.0 takes some annoying work. In particular, this method was modified to be:

public static Task ParseAsync(IPartialDataAccess source, IPartialDataAccess seed, Stream output, IList<RdcNeed> needList, int position = 0)
{
    // Base case: every item in the list has been processed.
    if (position >= needList.Count)
    {
        return new CompletedTask();
    }

    var item = needList[position];
    Task task;

    switch (item.BlockType)
    {
        case RdcNeedType.Source:
            task = source.CopyToAsync(output, Convert.ToInt64(item.FileOffset), Convert.ToInt64(item.BlockLength));
            break;
        case RdcNeedType.Seed:
            task = seed.CopyToAsync(output, Convert.ToInt64(item.FileOffset), Convert.ToInt64(item.BlockLength));
            break;
        default:
            throw new NotSupportedException();
    }

    // Chain the next item onto the current copy instead of blocking on it.
    return task.ContinueWith(resultTask =>
    {
        if (resultTask.Status == TaskStatus.Faulted)
            resultTask.Wait(); // rethrows the failure as an AggregateException
        return ParseAsync(source, seed, output, needList, position + 1);
    }).Unwrap();
}

This code is more complex, but it actually makes proper use of the TPL. We have changed the loop into a recursive function, so we can use ContinueWith to schedule the next iteration of the loop once the current copy completes.

And no, I can’t wait to get to C# 5.0 and have proper await work.
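
For comparison, here is a minimal sketch of how the same method might look with C# 5.0 async/await. The post does not show the definitions of RdcNeed, RdcNeedType, or IPartialDataAccess, so the shapes below are assumptions inferred from how the code uses them; InMemoryDataAccess and RdcParser are hypothetical names added only to make the sketch self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

// Hypothetical stand-ins: the post does not show these types, so the
// shapes below are assumptions inferred from how the code uses them.
public enum RdcNeedType { Source, Seed }

public sealed class RdcNeed
{
    public RdcNeedType BlockType { get; set; }
    public ulong FileOffset { get; set; }
    public ulong BlockLength { get; set; }
}

public interface IPartialDataAccess
{
    Task CopyToAsync(Stream output, long offset, long length);
}

// Hypothetical in-memory implementation, only here to exercise the sketch.
public sealed class InMemoryDataAccess : IPartialDataAccess
{
    private readonly byte[] data;

    public InMemoryDataAccess(byte[] data) { this.data = data; }

    public Task CopyToAsync(Stream output, long offset, long length)
    {
        // Stream.WriteAsync returns a Task that completes when the write does.
        return output.WriteAsync(data, checked((int)offset), checked((int)length));
    }
}

public static class RdcParser
{
    // The loop from the first version, with each blocking Wait() replaced
    // by await: the compiler generates the continuations that the C# 4.0
    // version chains by hand with ContinueWith and Unwrap.
    public static async Task ParseAsync(
        IPartialDataAccess source, IPartialDataAccess seed,
        Stream output, IEnumerable<RdcNeed> needList)
    {
        foreach (var item in needList)
        {
            long offset = Convert.ToInt64(item.FileOffset);
            long length = Convert.ToInt64(item.BlockLength);

            switch (item.BlockType)
            {
                case RdcNeedType.Source:
                    await source.CopyToAsync(output, offset, length);
                    break;
                case RdcNeedType.Seed:
                    await seed.CopyToAsync(output, offset, length);
                    break;
                default:
                    throw new NotSupportedException();
            }
        }
    }
}
```

With this shape, a failed copy faults the task returned by ParseAsync, so the explicit check for TaskStatus.Faulted is no longer needed. The copies still run one after another, as in both versions above.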

Categories: Blogs

Optimize storage with deep compression in DB2 10

Organizations are generating more data now than at any other time in history. And the need to comply with legal and governmental regulations means that they're keeping that data around for longer periods of time. As a result, databases are growing at an astonishing rate. In fact, according to industry analysts, enterprise databases are growing at the rate of 125 percent annually. This explosion in data volume places tremendous pressure on enterprises to store, protect, distribute, and derive value from all the data being generated. In 2007, IBM responded to this pressure by introducing new compression technology, known as Deep Compression, in DB2 Version 9. Since then, IBM has improved this technology in subsequent releases of DB2. This article describes the various compression methods that are available in DB2 Version 10.1, as well as provides recommended "best practices" that will help you achieve maximum storage space savings when you adopt any of the compression techniques available.
Categories: Companies, DB2

Get the most from the DB2 HADR standby database

DB2 High Availability Disaster Recovery (HADR) is an easy-to-use data replication feature that provides a high availability (HA) solution for both partial and complete site failures. Beginning with DB2 V9.7 Fix Pack 1, the standby database permits read access from user applications. This article explains how this capability can be used for read applications, and what the current limitations are. In addition, it includes suggestions for how you can utilize the potential of the standby database.
Categories: Companies, DB2

Get started with the IBM InfoSphere DataStage and QualityStage Operations Console Database, Part 1: An introduction

This article is a deep dive into the schema of the IBM InfoSphere DataStage and QualityStage Operations Database, and in particular the tables and columns that make up its key relationships. Specimen SQL queries are included to demonstrate how data can be read from these tables to answer specific operational questions. You can adapt these to build, for example, custom reports based on the operational data collected at your particular DataStage and QualityStage installation.
Categories: Companies, DB2

Have you registered for Kscope12?

Oracle Database Insider Blog - Wed, 05/16/2012 - 22:28

Join the Oracle Database Insider team at this year's ODTUG Kscope12 conference in San Antonio, Texas, June 24-28. Use discount code DBI (Database Insider) for $100 off!

ODTUG (Oracle Developer Tools User Group) holds its premier event for the Oracle technical community annually at Kscope. ODTUG Kscope12 is the place to be for the Oracle technical community in 2012. If you are a developer, architect, technical lead, or database administrator who works with Application Express, Business Intelligence, Oracle EPM (including Hyperion products, Essbase, and Planning), Database Development, or Fusion Middleware, Kscope12 is where you should be. It's hard to find a conference that's big enough to attract world-renowned speakers and small enough to give you the chance to share knowledge. Kscope12 is that conference.

Oracle at Kscope12

Sessions/Speakers:
Oracle development experts and Oracle ACE Directors are featured speakers during the conference. In all, Oracle will participate in 54 sessions, plus symposium sessions and hands-on lab sessions.

Exhibitor Showcase:
Oracle will have a 10x20 booth in the exhibit center. Encourage your customers and partners to stop by and learn about Oracle in BI, database development, and many other tools. Meet and greet with experts and learn the latest news about Oracle technology and trends!

Networking Activities:
Something for everyone! Check out www.kscope12.com and look for a host of networking and learning activities. And you won't want to miss the special event planned for Wednesday night, June 27, when Kscope12 participants leave the high-tech world and go to Knibbe Ranch (pronounced ka-NIB-bee), an honest-to-goodness working ranch.

Community Service Day:
Calling all volunteers! This year's Community Service Day will be dedicated to giving back by painting and landscaping a clubhouse of the Boys and Girls Club of San Antonio. Please plan to arrive in time to leave at 8:00am on Saturday morning, June 23. Pay it forward by giving something back to the community and have a great day with people from around the world!

Listen to the Podcast to learn more!

Get Ready for ODTUG's June 24-28 Kscope12!
ODTUG VP Monty Latiolais gives an overview of what to expect at this year's Oracle Development Tools User Group conference, Kscope12, in San Antonio this June 24-28.

Categories: Companies, Oracle

Ruby Firebird Extension Library – Fb bumped to version 0.7.0

Firebird News - Wed, 05/16/2012 - 22:18
With the following changes: Make fb compatible with Rubinius. Add encoding attribute and force strings from database to that encoding under Ruby 1.9.x. Revert to old object allocation method to prevent stack overflow. Satisfy ISO C90. Update: The Rubinius change had to be reverted to make MRI work.
Categories: Open Source

NO DB - the Center of Your Application Is Not the Database

myNoSQL - Alex Popescu - Wed, 05/16/2012 - 20:21

Uncle Bob:

The center of your application is not the database. Nor is it one or more of the frameworks you may be using. The center of your application are the use cases of your application. […] If you get the database involved early, then it will warp your design. It’ll fight to gain control of the center, and once there it will hold onto the center like a scruffy terrier. You have to work hard to keep the database out of the center of your systems. You have to continuously say “No” to the temptation to get the database working early.

Original title and link: NO DB - the Center of Your Application Is Not the Database (NoSQL database©myNoSQL)

Categories: Blogs, NoSQL

Training in London next week

MySQL Performance Blog - Wed, 05/16/2012 - 17:24
I’m going to deliver MySQL Training next week (May 21-24) in London. This is a rare opportunity as I do not personally deliver a lot of Training, especially outside of US. There are still some places left if you want to sign up. You will also get a signed copy of High Performance MySQL 3rd [...]
Categories: Blogs, MySQL, Open Source

Benchmarking single-row insert performance on Amazon EC2

MySQL Performance Blog - Wed, 05/16/2012 - 16:55
I have been working for a customer benchmarking insert performance on Amazon EC2, and I have some interesting results that I wanted to share. I used a nice and effective tool iiBench which has been developed by Tokutek. Though the “1 billion row insert challenge” for which this tool was originally built is long over, [...]
Categories: Blogs, MySQL, Open Source

The Grand Picture of Big Data and the Impact on the Architecture of Systems

myNoSQL - Alex Popescu - Wed, 05/16/2012 - 16:55

In a recent interview for AllThingsD, Mike Rhodin, the senior vice president of IBM’s Software Solutions Group, gave a very realistic description of what the future of data looks like:

[…] it comes out of the digitization of the physical world, the instrumentation of physical processes that’s going to generate huge amounts of new data, which is going to drive issues around storage, and what to do with all the data, how to analyze it. That pushes you toward real-time analytics and streaming technologies, because with real time, you don’t have to save the data, you want to look for anomalies as they occur.

This is indeed the grand picture of Big Data.

Now think for a second how many companies have such systems in place. Not many. Think now how many companies can offer as-complete-as-possible integrated systems to address these challenges. Very few.

These two answers reveal an interesting perspective on the future of the Big Data market.

On one side we have vendors building top-notch solutions: consider the new features in relational databases, NoSQL databases, Hadoop, etc. Looking at this space, you’ll have to agree that all of these are excellent solutions for tackling a sub-space of the overall problem. They are getting closer and closer to offering locally optimal solutions.

On the other side there are the system integrators and platform vendors. Their systems may not be the best at solving every aspect of a problem, but their focus is on addressing and solving the complete problem. Their sales pitch is integration and/or specialization.

As someone writing about polyglot persistence, the 1001 NoSQL and NewSQL databases, and the development of relational databases, I could be tempted to think that every company has the budget, the know-how, and the time to take top-notch sub-systems and create solutions crafted to its problem. But looking back in time and applying the lessons from other markets, I think it is safe to say that integrated solutions are preferred.

The lesson for both NoSQL and relational database vendors, and in fact for all (sub)system vendors playing in the Big Data market, is to design products with openness and integration in mind. Very few, if any, sub-systems will be part of the grand solution if they are architected as silos. They can continue to provide the ultimate locally optimal solutions, but as long as they are not architected to be part of a collaborative integrated platform, they’ll lose important segments of the market. Many products I’m writing about already follow this principle, many are taking steps towards friendlier integration, and many are still taking the silver bullet approach.

Original title and link: The Grand Picture of Big Data and the Impact on the Architecture of Systems (NoSQL database©myNoSQL)

Categories: Blogs, NoSQL

What Big Data Is Used for at Facebook

myNoSQL - Alex Popescu - Wed, 05/16/2012 - 15:43

Just a couple of examples: product and brand engagement, advertising.

A recent study we just published in the Proceedings of the National Academy of Sciences tells a new story about the way people adopt products and engage with them. The prevailing theories about this process suggested that what influences a person [to] adopt technologies is the number or percentage of friends who have already adopted the same technology, along with a person’s threshold for adopting such technologies. Our study shows that it’s less about the number of your friends who are using the technology, but more about their diversity. […] Some of the work we’re interested in understanding is how your friends influence your decisions to engage with advertising and brands.

Original title and link: What Big Data Is Used for at Facebook (NoSQL database©myNoSQL)

Categories: Blogs, NoSQL

Cassandra at Workware Systems: Data Model FTW

myNoSQL - Alex Popescu - Wed, 05/16/2012 - 15:34

One of the stories in which the deciding factor for using Cassandra was primarily the data model and not its scalability characteristics:

We started working with relational databases, and began building things primarily with PostgreSQL at first. But dealing with the kind of data that we do, the data model just wasn’t appropriate. We started with Cassandra in the beginning to solve one problem: we needed to persist large vector data that was updated frequently from many different sources. RDBMS’s just don’t do that very well, and the performance is really terrible for fast read operations. By contrast, Cassandra stores that type of data exceptionally well and the performance is fantastic. We went on from there and just decided to store everything in Cassandra.

Original title and link: Cassandra at Workware Systems: Data Model FTW (NoSQL database©myNoSQL)

Categories: Blogs, NoSQL

Percona Server 5.5.23-25.3 released!

MySQL Performance Blog - Wed, 05/16/2012 - 13:09
Percona is glad to announce the release of Percona Server 5.5.23-25.3 on May 16, 2012 (Downloads are available here and from the Percona Software Repositories). Based on MySQL 5.5.23, including all the bug fixes in it, Percona Server 5.5.23-25.3 is now the current stable release in the 5.5 series. All of Percona’s software is open-source and free, all the details of the release can [...]
Categories: Blogs, MySQL, Open Source

Firebird 2.5.2 uploaded in repository for the next Ubuntu release 12.10

Firebird News - Wed, 05/16/2012 - 12:40
I have uploaded firebird2.5 2.5.2~svn+54476.ds4-1 to the Quantal Firebird PPA for testers. It is the same package as the one in Debian Sid.
Categories: Open Source