Changes for document Session Big Data

From version 11.1
edited by Catherine Nuel
on 2012/11/14 12:01
To version 12.1
edited by Selvalakshmi R
on 2012/11/16 11:35
Change comment: There is no comment for this version

Metadata changes

Document author

Content changes

... ... @@ -3,28 +3,25 @@
3 3
4 4 ===== Keynote Big data =====
5 5 **Speaker**: Jim Walker, HortonWorks
6 -**Schedule**: 12:00 - 12:15am
6 +**Schedule**: 10:00 - 10:15am
7 7
8 8 ===== SpagoBI and Big Data: next Open Source Information Management suite =====
9 9 **Speaker:** Monica Franceschini, Engineering
10 -**Schedule:** Thursday Nov 29, 12:15 - 12:30pm
10 +**Schedule:** Thursday Nov 29, 10:15 - 10:30am
11 11 **Abstract:** Organizations adopt Business Intelligence tools to analyze tons of data; nonetheless, several business leaders still lack the information they actually need. This happens because the information management scenario is evolving. New kinds of content are being added to structured information, supported by well-known processes, tools and practices, including information coming from social computing. They will be managed by disparate processes, fragmented tools and new practices. This information will combine with various contents of enterprise systems: documents, transactional data, databases and data warehouses, images, audio, texts, videos. This huge amount of content is called “big data”, even though it is not just about a large amount of data. It refers to the capability of managing data that grow along three dimensions - volume, velocity and variety - while preserving the simplicity of the user interface. The speech describes SpagoBI's approach to the “big data” scenario and presents the SpagoBI suite roadmap, which is two-fold: it aims to address existing and emerging analytical areas and domains, providing the suite with new capabilities - including big data and open data support, in-memory analysis, real-time and mobile BI - and following a research path towards a new generation of the SpagoBI suite.
12 12
13 13
14 -===== Talend: The Big Challenge of Big Data and Hadoop Integration =====
14 +===== The Big Challenge of Big Data and Hadoop Integration =====
15 15 **Speaker:** Cedric Carbone, Talend
16 -**Schedule:** Thursday Nov 29, 12:30 - 12:45pm
16 +**Schedule:** Thursday Nov 29, 10:30 - 10:45am
17 17 **Abstract:** Enterprises can't close their doors just because integration tools can't cope with the volume of information their systems produce. As each day goes by, their information becomes larger and more complicated, and enterprises must constantly struggle to manage the integration of dozens (or hundreds) of systems. Apache Hadoop has quickly become the technology of choice for enterprises that need to perform complex analysis of petabytes of data, but few are aware of its potential to handle large-scale integration work. By using effective tools, integrators can process the complex transformation, synchronization, and orchestration tasks required in a high-performance, low-cost, infinitely scalable way. In this talk, Cédric Carbone will discuss how Hadoop can be used to integrate disparate systems and services, and will demonstrate the process of designing and deploying common integration tasks.
18 18
19 19
20 20
21 -===== BPMconseil: Using Vanilla to manage Hadoop database =====
21 +===== Using Vanilla to manage Hadoop database =====
22 22 **Speaker:** Patrick Beaucamp, Bpm-Conseil
23 -**Schedule:** Thursday Nov 29, 12:45 - 01:00pm
23 +**Schedule:** Thursday Nov 29, 10:45 - 11:00am
24 24 **Abstract:** This presentation will demonstrate how to use Vanilla to read and write data in a Hadoop database, using big data stores such as HBase or Cassandra, along with the Hadoop-ready Solr/Lucene search engine - embedded in Vanilla - to run clustered searches on Hadoop data.
25 25
26 26
27 -===== PKU: Tracking code evolution for open source universe =====
28 -**Speaker:** Minghui Zhou, Peking University
29 -**Schedule:** Thursday Nov 29, 02:00 - 02:15pm
30 -**Abstract:** The large amount of existing OSS artifacts provides abundant material for understanding how code is reused in the open source universe - in particular, which code pieces are reused most often, in what circumstances people reuse code, and so forth. Understanding this process could help with legacy software maintenance, as well as help to identify best practices in software development. Targeting the change history data of thousands of open source projects, we try to answer the following questions: first, how is code reused by other projects? Second, how are code files organized in a project, and how does this organization change over time? To answer these questions, there are several technical difficulties to overcome. For example, because of the different kinds of VCSs, it is hard to devise a uniform model that can represent the evolution of the code files stored in them. Also, each VCS may have its own data format, so extracting data from them is a big challenge. Furthermore, using current software algorithms and hardware platforms to analyze the version iteration and reuse information of about a billion code files is yet another challenge.
27 +11:00 - 11:15am: Coffee Break
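The Talend abstract above describes Hadoop's fit for large-scale transformation and integration work. As background, here is a minimal, single-process Python sketch of the map/shuffle/reduce model that Hadoop applies at petabyte scale. It has no Hadoop dependency, and all function and variable names are illustrative - this is not Talend's product or Hadoop's actual API.

```python
from collections import defaultdict

def map_phase(records):
    # Emit (key, value) pairs -- here, one (word, 1) pair per word.
    for line in records:
        for word in line.split():
            yield word.lower(), 1

def shuffle(pairs):
    # Group values by key, as Hadoop does between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Aggregate each key's values -- here, summing the per-word counts.
    return {key: sum(values) for key, values in groups.items()}

# Toy input standing in for records pulled from many disparate systems.
records = ["big data and hadoop", "hadoop integration at scale"]
counts = reduce_phase(shuffle(map_phase(records)))
```

In a real Hadoop job the map and reduce steps run in parallel across a cluster, and the shuffle moves data between nodes; the structure of the computation, however, is the same.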