"A last-generation airplane may generate more than 500 GB of data in one single flight"
"We are still in the midst of an industrial economy, but it seems that we are shifting towards a digital process-based economy"
"The so-called Business Intelligence is now starting to move towards BigData as a consequence of the conjunction of several trends"
The transformative power of data is probably the most important driving force in the development of smart city models. Digital processes have now taken the place of analog processes. Information is no longer obtained by observation and analysis, but gathered from data and analytics. Reality is no longer looked at, but quantified with sensors. In urban, industrial, financial or even healthcare settings, we no longer perceive processes and events themselves, but their digital representation, created from data captured by sensors and other systems that digitalize reality.
Data have existed since time immemorial, but the mere act of collecting them manually already gave them some structure. Jobs that involved transcribing data from hand-written forms had their golden age, and data entry clerks were at work for many years. Databases were structured around semantics that were fixed in their initial design. The amount of data obtained by manual processes is far from negligible, and systems were designed to process large amounts of it. As time has gone by, data from human activities have been captured with form digitalization systems, and data from human-machine interactions, as well as data from machines themselves, have been added. With the latter we find ourselves in the age of the Internet of Things (IoT), where data are no longer structured. Sensors measure noise, temperature, air quality or a person’s heartbeat, and the semantics of data systems are no longer as straightforward as they used to be; neither is their structure. Designing systems that can extract knowledge from such data, for example “when pollution levels are high, there is about a 10% increase in a person’s average heartbeat when doing sports”, certainly poses a challenge. Big Data is the umbrella term for all the aforementioned changes, including the volume, variety and velocity of data generation. As a case in point, a last-generation airplane may generate more than 500 GB of data in one single flight.
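To make the heartbeat example more tangible, the following is a minimal Python sketch of how two unrelated sensor streams might be combined to test such a statement. All readings, the `HIGH_POLLUTION` threshold and the function name `relative_increase` are invented for illustration and do not come from any real measurement or product.

```python
from statistics import mean

# Invented readings keyed by hour of day; a real system would ingest
# these from sensors rather than from hard-coded dictionaries.
pollution_index = {8: 40, 9: 45, 10: 120, 11: 130}    # hypothetical air-quality index
heart_rate_bpm = {8: 140, 9: 142, 10: 155, 11: 157}   # hypothetical runner's heart rate

HIGH_POLLUTION = 100  # assumed threshold separating "high" from "normal"


def relative_increase(pollution, heart_rate, threshold):
    """Return the relative change in mean heart rate during high-pollution hours."""
    # Only hours present in both streams can be compared.
    shared_hours = pollution.keys() & heart_rate.keys()
    high = [heart_rate[h] for h in shared_hours if pollution[h] >= threshold]
    low = [heart_rate[h] for h in shared_hours if pollution[h] < threshold]
    if not high or not low:
        raise ValueError("need readings on both sides of the threshold")
    return mean(high) / mean(low) - 1.0


increase = relative_increase(pollution_index, heart_rate_bpm, HIGH_POLLUTION)
print(f"Heart rate change during high pollution: {increase:.1%}")
```

Even in this toy form, the difficulty described above is visible: the two streams share no schema, only a timestamp, and the meaning of “high pollution” has to be supplied from outside the data.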
The transformation of companies
Companies that focus on extracting information from data have had to adapt to the digitalization process. Oracle, IBM, SAP and Teradata play a significant role in data processing, which is ultimately essential to underpin the “smart” component of an industrial process or a city. Energy companies, as well as communication and infrastructure companies, are already working with data analysis systems provided by the companies mentioned above.
The case of Teradata is of particular interest. All along the way, the company has been involved in defining and shaping data-processing trends, as well as in adapting to the technological changes that define innovation roadmaps. Teradata, a company that specializes in data centers, Business Intelligence, data analytics and Big Data, has strengthened its position as a reference for data management with the latest innovations in its portfolio of solutions and products. The new version of its Teradata Database software is designed to make the most of its Active Enterprise Data Warehouse systems while refining its Unified Data Architecture (UDA).
Its new assets and strategy for the next few months were presented at the European edition of its Teradata Universe meeting, which was held in Prague from April 6th to April 9th. The 19th edition gathered more than 1000 professionals from different sectors, including marketing, finance, the public sector, insurance, logistics, retail, communications, transport and healthcare, among others. In many cases, such sectors are deeply involved in building smart cities.
Data have become an essential part of the digital economy. We are still in the midst of an industrial economy, but it seems that we are shifting towards a digital process-based economy, and data are the raw material from which such processes are created. So-called Business Intelligence is now starting to move towards Big Data as a consequence of the conjunction of several trends. First, digital and communication technologies make data generation faster, easier and cheaper, and the amount of data that may be generated increases exponentially. Second, the equipment and systems in charge of data processing gain performance and speed with each generation of microprocessors, storage and memory. Third, companies are attaching more and more value to the analysis of their data, and they are starting to monetize it. This is a significant competitive advantage in terms of winning new clients, keeping current clients and venturing into new research and development areas in a constantly changing world.
Application fields become more diverse as new data sources are added. Data sources may be social networks for marketing, sensors for the Internet of Things, smart metering systems for energy consumption or even sensors in manufacturing and industrial production. As data become more diverse, both in their source and their structure (or lack thereof), Teradata offers solutions for data analysis from a unification perspective. Unified Data Architecture (UDA) is Teradata’s interpretation of the Logical Data Warehouse (LDW), combined with the know-how acquired over several decades in the Enterprise Data Warehouse (EDW). UDA includes several platforms, such as the Data Lake, the platform for Data Discovery and the IDW (Integrated Data Warehouse). This makes it possible to work on Big Data applications and provide solutions tailored to the current needs of companies. Hermann Wimmer, president of Teradata, put forward several definitions of Big Data in his presentation, which included the technological perspective (“a set of technologies that make it possible to store and analyze large amounts of data”), the functional perspective (“Big Data enables companies to manage all their data”) and even the philosophical point of view (“Big Data turns data and analytics into essential elements for competitiveness”). In fact, “Big Data” is a relatively broad term that describes a side-effect of the digitalization of all aspects of life. Digitalization turns analog entities into data that may “only” be used to rebuild the analog signal, or else may be processed using microprocessor-based technologies in order to obtain a result. When data-processing technologies add a semantic layer on top of the data, analytics comes into play to extract information.
The answer to market trends
Teradata has accumulated know-how in the area of data analysis for more than 35 years. It chose the “appliance” approach, in which software and hardware are optimally matched to each other. Above all, it is an engineering company, and as such its systems are designed to make a difference in this field by adopting technologies and solutions tailored to the needs to be addressed. The solutions introduced at the annual meeting in Prague were the answer, from an engineering point of view, to market trends, which currently focus on the “Logical Data Warehouse”, a concept that brings together several systems, analysis techniques, programming languages and kinds of data. On the database side, Teradata 15 extends what the system can do, whereas new software products such as Teradata QueryGrid make it possible to manage the resources of UDA (Teradata’s unified data architecture, which according to Gartner underpins the Logical Data Warehouse) and to access data held in Hadoop environments, for example. It is now possible to manage multi-structured and dynamic data sources in JSON or XML, and to work not only with SQL, but also with Java, Perl, Ruby, Python or R. 3D representation is also added by introducing the “Z coordinate”, which opens the door to the analysis of new variables in markets such as telecommunications, insurance and energy prospecting. It should not be forgotten that data visualization is an essential component in the introduction of the Logical Data Warehouse, where it is desirable to extract information quickly and in such a way that it is ready for decision-making.
The crown jewel: EDW 6750
True to its “appliance” philosophy, Teradata also introduced its Active EDW 6750, both for a classic approach to data analysis and for integration in unified platforms. This is an Enterprise Data Warehouse with the latest SSD storage technologies, the most efficient Intel processors and smart RAM management. Other manufacturers and developers of database systems have opted for in-memory processing of data. However, this is only an ideal solution when relatively small data sets are used. Teradata works with large amounts of data, which makes pure in-memory processing uneconomical. This is why the company has developed an engineering solution that selectively transfers to memory only those data that are relevant to analytical processing. Based on algorithms that measure data “temperature”, that is, how frequently each piece of data is accessed, Teradata can obtain outstanding performance without having to hold all data in system memory. According to its reports, 90% of input/output operations depend on 20% of the data. The main difficulty lies in the dynamic nature of this 20%, so Teradata has developed algorithms that move data to and from system memory so that as many calculations as possible are performed there. In fact, a recent benchmark that worked with data in the petabyte range achieved excellent results: 95% of the input/output operations were served from system memory, even though the hardware footprint of the system was much smaller than that of systems based purely on “in-memory” technologies.
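The idea of data temperature can be illustrated with a short Python sketch. This is not Teradata’s algorithm, whose details are not described here; it is a toy model that assumes temperature is simply a decaying count of reads per block. The class name `TemperatureCache` and the parameters `capacity` and `decay` are hypothetical.

```python
from collections import defaultdict


class TemperatureCache:
    """Toy hot/cold placement: keeps the most frequently read blocks in memory."""

    def __init__(self, capacity, decay=0.5):
        self.capacity = capacity      # number of blocks that fit in memory (assumed)
        self.decay = decay            # cooling factor applied by cool() (assumed)
        self.temperature = defaultdict(float)
        self.in_memory = set()
        self.hits = 0
        self.reads = 0

    def read(self, block_id):
        """Record one read and report whether it was served from memory."""
        self.reads += 1
        hit = block_id in self.in_memory
        if hit:
            self.hits += 1
        self.temperature[block_id] += 1.0
        return hit

    def cool(self):
        """Age all temperatures so that formerly hot blocks can turn cold."""
        for block_id in self.temperature:
            self.temperature[block_id] *= self.decay

    def rebalance(self):
        """Place the hottest blocks in memory and evict everything else."""
        ranked = sorted(self.temperature, key=self.temperature.get, reverse=True)
        self.in_memory = set(ranked[: self.capacity])

    def hit_ratio(self):
        """Fraction of reads so far that were served from memory."""
        return self.hits / self.reads if self.reads else 0.0


cache = TemperatureCache(capacity=2)
for block in ["a", "a", "a", "b", "b", "c"]:
    cache.read(block)
cache.rebalance()   # "a" and "b" are now the hot blocks held in memory
```

Calling `cool()` periodically and then `rebalance()` is what lets the hot set follow a shifting workload: a block that stops being read loses temperature and eventually gives up its place in memory to one that is read more often.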
Data are the present… and the future
On the way towards the digital economy, data are an essential component for development. Companies that cannot or will not work with data are doomed, and may even disappear. In the Smart Cities context, data are the raw material required to build them from the business, personal and social points of view. Data make it possible to build useful and relevant services in such areas as transport, energy, communications, healthcare or culture and leisure. Companies such as Teradata, which specialize in working with data, are expected to offer systems that are as efficient and optimal as possible, preferably based on standards and interoperable with other systems and technologies.
SSD storage is essential to improve system response when working with Big Data, and it makes it possible to design intelligent storage hierarchies, which avoid the need to keep all data in RAM.
Towards the corporate Logical Data Warehouse
Regardless of the technological solutions offered to make data cross-sectional, tools that add value to the technology are still required. UDA makes it possible to integrate data from various sources (whether or not they are stored in different file systems, and with or without structure) using Teradata Database 15 and EDW systems from the new 6750 series, or appliances containing Hortonworks or Aster Database systems. Nevertheless, in order to obtain value from data, a tool is required that makes it possible to process such data in a unified way, whether from the point of view of the user, the data scientist or the corporate agent in charge of gathering information, making decisions, or both.
QueryGrid is the answer to these needs. It will be available from the third quarter of 2014 onwards, but it has already been one of the most prominent subjects in Teradata’s announcements for the coming months.
Teradata Database 15
Not only SQL
Teradata Database 15 has been another leading subject in recent months. The company’s flagship database system caters for trends in the sector and adds all the elements and features required by companies working with data. Data are the new currency, as Hermann Wimmer, president of Teradata, stated in his presentation at the event that took place in Prague last April. Teradata 15 offers the tools required to manage data from different origins, to work with several programming languages and to make the use of algorithms for optimizing data analysis tasks more flexible. Its compatibility with JSON opens the door to applications in the Internet of Things, and both multi-structured and structured data may be managed.
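Why JSON matters for the Internet of Things can be seen in a small Python sketch that uses only the standard `json` module. It does not use any Teradata API; the device names, field names and the helper `flatten` are invented for illustration. Each device reports a different set of fields, which is what “multi-structured” means in practice.

```python
import json

# Hypothetical sensor messages: each device reports a different set of fields.
raw_messages = [
    '{"device": "noise-01", "ts": "2014-04-07T10:00:00Z", "db": 62.5}',
    '{"device": "air-07", "ts": "2014-04-07T10:00:05Z", "pm10": 38, "no2": 21}',
    '{"device": "wearable-3", "ts": "2014-04-07T10:00:09Z", "heart": {"bpm": 148}}',
]


def flatten(record, prefix=""):
    """Turn nested JSON objects into a flat dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


rows = [flatten(json.loads(message)) for message in raw_messages]

# The union of all keys gives the columns of a sparse, table-like view;
# fields a device did not report are filled with None.
columns = sorted({key for row in rows for key in row})
table = [[row.get(column) for column in columns] for row in rows]
```

A fixed relational schema would need a column for every field of every device in advance, whereas JSON lets each message carry its own structure and defers the decision about columns to query time.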
Teradata Active EDW 6750
Silicon hosting intelligence
Teradata is determined to offer its data analysis solutions as a combination of software and hardware optimized to run fast and efficiently. Teradata’s engineering solutions focus on sizing storage and memory systems in order to offer realistic solutions; one example is smart memory management based on metrics such as data temperature. Such a holistic approach implies a substantial initial investment for companies, but it allows them to avoid the integration and optimization work that would have to be done internally if the hardware and software were bought separately. Teradata should not overlook trends such as the “cloud” or falling hardware prices, but right now systems such as the Active EDW 6750 are appropriate for introducing data analysis in big companies.
The EDW 6750 scales with the company’s needs, and its equipment includes the most recent Intel Xeon processors and RAM, together with solid-state, hybrid or conventional storage sized as required.