Big Data Materials

We have released informational slide decks on Big Data technologies: Hadoop, HBase, Hive, ZooKeeper...

Introducing the best BI platform for planning and budgeting

Forecasts, web and Excel-like interface, mobile apps, QlikView, SAP and Salesforce integration...

Pentaho Analytics. A great leap forward

Pentaho 7 has been released, with big surprises. Discover with us the improvements in the best Open BI suite

The best offer of Open Source courses

After the great reception of our eminently practical Open Source courses, we are launching the 2016 sessions

30 Dec 2016

See what's new in Jedox 7 in this video

In this video you can watch a presentation of what's new in Jedox 7, the best Business Intelligence tool for planning, budgeting, ratios, business rules and forecasts

21 Dec 2016

New Search and Tags functionalities in Pentaho Console


Hi, if you are a Pentaho user or admin managing a production environment where the number of folders, reports, analyses and dashboards grows day by day, it is very useful to have a way to quickly find the element you want to open.

That's why we've created this component, which allows you to:

- Search by folder
- Add tags and comments for any element
- Search by any word of title, tags, and comments
- Select by any tag
- Search by date of creation or modification
- Filter by type of element: Report, OLAP or Dashboard
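
As a rough illustration of how these filters can combine, here is a minimal Python sketch. It is not the component's actual code: the `Element` record, its field names and the `search` function are hypothetical, and the sample catalog is invented for the example.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Element:
    """One repository item. Field names are illustrative, not Pentaho's."""
    title: str
    folder: str
    kind: str  # "Report", "OLAP" or "Dashboard"
    modified: date
    tags: set = field(default_factory=set)
    comment: str = ""


def search(elements, text=None, tag=None, kind=None, folder=None, since=None):
    """Return the elements that satisfy every filter that was supplied."""
    results = []
    for e in elements:
        if folder is not None and not e.folder.startswith(folder):
            continue
        if kind is not None and e.kind != kind:
            continue
        if tag is not None and tag not in e.tags:
            continue
        if since is not None and e.modified < since:
            continue
        if text is not None:
            # Free-text search looks at the title, the comment and the tags.
            haystack = " ".join([e.title, e.comment, *sorted(e.tags)]).lower()
            if text.lower() not in haystack:
                continue
        results.append(e)
    return results


# Hypothetical sample data.
catalog = [
    Element("Monthly sales", "/sales", "Report", date(2016, 12, 1), {"finance"}, "board pack"),
    Element("Sales cube", "/sales", "OLAP", date(2016, 11, 3), {"finance", "adhoc"}),
    Element("Traffic overview", "/ops", "Dashboard", date(2016, 12, 15), {"realtime"}),
]

print([e.title for e in search(catalog, tag="finance", kind="OLAP")])
print([e.title for e in search(catalog, text="board")])
print([e.title for e in search(catalog, since=date(2016, 12, 1))])
```

Each filter is optional and they are combined with AND, which matches the idea of narrowing a large repository step by step.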

You can see it in action in this online demo

Select by date of creation or modification

Select by type of element, tag, date and text search

Add tags and description

20 Dec 2016

Santander and BBVA take their rivalry to Business Intelligence

Both Banco Santander and BBVA are taking their rivalry into Business Intelligence. We say Business Intelligence rather than Big Data, which is how they tend to promote it, because for now both applications have more of the former than of the latter. Over time they will probably use more of the latter.

The question is: will it really succeed among merchants? Are they prepared and trained to use Business Intelligence tools?

Here are the details:

Santander's is called Mi Comercio

Mi Comercio has three basic features:
  • ‘Mi Facturación’ collects the totals recorded by the POS terminals (TPVs) over the last 15 days, including the detail of these transactions.
  • ‘Mis Clientes’ compiles, every month, aggregated data on the new and returning customers who have bought at the business and at nearby competitors. With this information, companies and the self-employed can make business decisions based on insights such as the time of day when their customers buy most, whether they are attracting more customers than their competitors, which other sectors the people who visit their businesses usually buy in, and so on.
  • ‘Ayuda y Soporte’ answers customers' most frequently asked questions and puts the support phone numbers for POS terminal users a single click away.


BBVA's is called Commerce 360

  • Access the purchase data from your BBVA POS terminal month by month and compare it with the commercial activity of businesses in your area and sector to make useful decisions for your business.
  • It gives you objective data on your customers' loyalty, their demographic segments and their main postal codes of origin.
  • Compare these indicators with those of your area to identify opportunities to improve opening hours, prices or marketing actions.
  • All of this at no cost if you have your POS terminal with BBVA.

19 Dec 2016

iD v2 is now available on OpenStreetMap



The web-based iD editor is designed to help create an even better, more current OpenStreetMap by lowering the threshold of entry to mapping with a straightforward, in-browser editing experience.

Head over to OpenStreetMap and start editing today! You can make meaningful contributions with just a few minutes of training.
You can also help OpenStreetMap by donating to the OpenStreetMap Foundation’s 2016 funding drive. Donate today and your gift will go even further because Mapbox is matching €10,000 of donations.
Check out iD on Github to contribute code, make suggestions, or report an issue.

Google open sources Embedding Projector for high-dimensional data



Good news for open source data visualization fans: Google open sources Embedding Projector for high-dimensional data

The tool will help machine learning researchers to visualize data without having to install and run TensorFlow.
Dimensionality, and vectors in general, is not something that most of us find easy to understand. 
The problem is that we all live in a three-dimensional world. We are taught length, width and height, so we struggle to imagine what a fourth, fifth or sixth dimension might look like. This is why most of us found Christopher Nolan’s representation of additional dimensions wonky in the movie Interstellar.



To enable a more intuitive exploration process, they are open-sourcing the Embedding Projector, a web application for interactive visualization and analysis of high-dimensional data recently shown as an A.I. Experiment, as part of TensorFlow.
They are also releasing a standalone version at projector.tensorflow.org, where users can visualize their high-dimensional data without the need to install and run TensorFlow.

14 Dec 2016

A quick review of STPivot4 Open Source OLAP tool

STPivot4 is based on the older JPivot and Pivot4J projects, which are no longer actively developed. We have added, improved and strengthened many functionalities, listed below as technical features.

STPivot4 includes an innovative workspace for building your query that lets end users work easily with drag and drop. End users can quickly identify the dimensions, measures or filters they want to work with. You can now search, filter, rank and select to refine your query before running it, avoiding long query response times.
Design, usability and graphs have been improved; in short, it is easier for end users to use and manage.

STPivot4 supports Mondrian 4, which brings great scalability, compliance and performance improvements, and, as a Pentaho plugin, it works with the latest available Pentaho versions.




Main Features and Download

You can download the open source code from GitHub. We will be glad to help you with your Business Intelligence projects using Open Source tools if you need support, development or consultancy. We would like to receive your feedback: info@stratebi.com


  • Cube Selector

    We've created a new popup window where end users can easily select dimension values, measures, levels... for their queries. It includes a new search feature that improves value selection with high cardinality dimensions.
    In the design window, end users can drag and drop their dimensions, filters and measures quickly and easily.

  • New search functionality

    One of the best new features of STPivot is the ability to search dimension values easily when you manage a great number of values.
    This is very helpful when you need to identify the values you want on each level/dimension/hierarchy in order to include them in your query result.

  • Drag and Drop query design and build

    If you have ever wanted to build your queries easily and quickly, this visual drag and drop design now makes it possible.

  • Filter and drill to detail

    One of the best functionalities of any OLAP viewer is the ability to drill through any dimension and measure in order to get powerful insights about your data models.

  • Advanced Filters

    Advanced filters are included within the Selector, so you can leverage all the power of OLAP cubes, refining your queries and nesting each filter.

     Ranking Top Count 
     Ranking Bottom Count
     Order 
     Visual Totals
     Filter 
     Limit First/Last

  • Graphics and Visualization 

    STPivot includes a great variety of graphic libraries (pie, chart, heatmaps, line, bar...) fully configurable with popup information for any of your analytical needs. 


  • Calculator

    All the simplicity and power for end users, so they can directly create their own formulas with a friendly interface, in order to include them in their OLAP views. 
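
To make the ranking and filter options listed above more concrete, here is a small Python sketch of what Top Count, Bottom Count and a nested filter do to a set of members. It only mimics the behaviour of the corresponding OLAP operations on an in-memory dictionary; the function names, city names and sales figures are hypothetical and this is not STPivot4 code.

```python
def top_count(members, n, measure):
    """Keep the n members with the highest measure, highest first."""
    return sorted(members, key=lambda m: measure[m], reverse=True)[:n]


def bottom_count(members, n, measure):
    """Keep the n members with the lowest measure, lowest first."""
    return sorted(members, key=lambda m: measure[m])[:n]


def filter_members(members, predicate, measure):
    """Keep the members whose measure satisfies the predicate."""
    return [m for m in members if predicate(measure[m])]


# Hypothetical measure values per member of a City level.
sales = {"Madrid": 120, "Sevilla": 45, "Bilbao": 80, "Valencia": 95}
members = list(sales)

top = top_count(members, 2, sales)
bottom = bottom_count(members, 2, sales)

# Nesting: first filter, then rank what is left.
nested = top_count(filter_members(members, lambda v: v < 100, sales), 2, sales)

print(top)
print(bottom)
print(nested)
```

Nesting works because every operation takes a list of members and returns a list of members, so the output of one filter can feed the next.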

Roadmap

We are working on new functionalities for STPivot. Some of them are listed below: 

  •  Complex Formula Editor
  •  Create calculated members 
  •  Analysis Wizard 
  •  What If 
  •  Undo Feature 
  •  Improving user interface, performance and integration 
  •  New 'cool' ideas... 

 

Location Intelligence for Indoor Maps


Carto, the geospatial visualization tool of which we are partners (you can see it working alongside Pentaho in this near-real-time traffic analysis application), is launching a very interesting feature:

Business Intelligence analysis for indoor locations (Location Intelligence), that is, large offices, shopping centers, universities, public or sports buildings, and so on. The possibilities are enormous.

Our colleagues at Carto tell us:

"Indoor maps often direct users to emergency exits, which has limited our context of mapping to external geographical spaces. With the rise of Indoor Positioning Systems (IPS), however, the field of data visualization is turning inward to pioneer new paths to purchase with indoor maps.


Situm, a member of Telefónica’s Open Future initiative, and known as the “GPS for indoor” start-up, analyzes indoor traffic for various sectors using location intelligence. Despite an exponential rise in mobile purchasing, the Department of Commerce reports that 90 percent of retail purchases are transacted offline, which means managing in-store traffic is crucial to maintaining a competitive edge. But aside from providing directions for customers, what, exactly, can IPS offer? Well, as we learned during a recent collaboration with Situm, the answer is a lot"


Pentaho's predictions for 2017

See an online demo of Pentaho CE
  • Self-service data prep will unlock big data’s full value. Organizations building advanced, big data deployments like the ones needed to accurately predict election outcomes are buckling under huge, diverse data volumes. The amount of time spent simply preparing data is overwhelming organizations struggling for resources and time. This is often to the tune of anywhere between 50-70% of IT time spent preparing data. That sentiment data I mentioned only exacerbates this problem needing to be continually ingested from a huge universe of social network feeds and prepared for analysis. Self-service visualization tools that can only analyze data after it’s been prepared are diminishing in value. Our customer Sears Holdings does spot checks and visualizes its data throughout its lifecycle, which enables it to make more valuable data-driven decisions - in time for them to matter - while reducing costs. Expect more software vendors in 2017 to follow our lead and start offering tools that bridge the gap between analytics and data prep with an integrated experience for both.

  • Organizations are replacing self-service reporting with embedded analytics. As I first predicted in 2015, embedded analytics would become ‘the new BI’. We are now really starting to see our vision of ‘next generation applications’ mature and replace self-service reporting. Organizations can see that analytics are an expectation and must be embedded at the point of impact regardless of the end user’s sophistication. In our customer CERN’s case, this involves 15,000 users in various operational roles accessing Pentaho analytics from their normal line-of-business applications.

  • IoT’s adoption and convergence with big data will make automated data onboarding a requirement. This year predictive maintenance became a marquee use case for IoT’s ROI potential and this will continue to gather speed in 2017. Everything from shipping containers to oil-drilling screws to train doors is being fitted with sensors to track things like location, operating status and power consumption. And speaking of trains, expect to hear more about our project with Hitachi Rail to build ‘self-diagnosing’ trains that can detect if a problem is brewing, so that the train can either be taken out of service or repaired before the failure takes place. In order to ingest, blend and analyze the massive volumes of data all these sensors generate, more businesses will need to be able to automatically detect and onboard any data type into their analytics pipeline. This is simply way too big, complex, fast-moving and mind-numbing a job for overburdened IT teams to handle manually.

  • 2017’s early adopters of AI and machine learning in analytics will gain a huge first-mover advantage in the digitalization of business. Big data and IoT use cases in business and industry are approaching the data variety, volume and velocity levels of large-scale scientific models for which AI and machine learning were originally conceived. Early adopters gain a jump start on the market in 2017 because they know that the sooner these systems begin learning about the contexts in which they operate, the sooner they will get to work mining data to make increasingly accurate predictions. This is just as true for the online retailer wanting to offer better recommendations to customers, a self-driving car manufacturer or an airport seeking to prevent the next terrorist attack.

  • Cybersecurity will be the most prominent big data use case. As with election polls, detecting cybersecurity breaches depends on understanding complexities of human behavior. Accurate predictions depend upon blending structured data with sentiment analysis, location and other data. BT’s Assure Cyber service, for example, uses Pentaho to help detect and mitigate complex and sustained security threats by blending event data and telemetry from business systems, traditional security controls, advanced detection tools among others.


7 Dec 2016

New Open Source OLAP viewer available: STPivot4




STPivot4 is based on the old Pivot4J project where functionality has been added, improved and extended. These technical features are mentioned below.



GitHub STPivot4
For additional information, you may visit STPivot4 Project page at http://bit.ly/2gdy09H

Main Features:
  • STPivot4 is a Pentaho plugin for visualizing OLAP cubes.
  • Deploys as a Pentaho plugin
  • Supports Mondrian 4!
  • Improves Pentaho user experience.
  • Intuitive UI with Drag and Drop for Measures, Dimensions and Filters
  • Adds key features to Pentaho OLAP viewer replacing JPivot.
  • Easy multi-level member selection.
  • Advanced and function based member selection (Limit, Ranking, Filter, Order).
  • Let user create "on the fly" formulas and calculations using
  • Non-MDX grand totals (min, max, avg and sum) per member, hierarchy or axis.
  • New user friendly Selector Area
  • and more…
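
As an illustration of the grand totals feature, the sketch below computes sum, min, max and avg totals for the rows and columns of a result grid on the client side, without sending another MDX query. Reading "Non MDX" as "computed by the viewer rather than by the OLAP engine" is an assumption, and the `axis_totals` function and the sample grid are hypothetical.

```python
from statistics import mean

# Supported total types, mirroring the min, max, avg and sum options.
AGGREGATORS = {"sum": sum, "min": min, "max": max, "avg": mean}


def axis_totals(grid, how="sum"):
    """Return (row_totals, column_totals) for a rectangular grid of cell values."""
    agg = AGGREGATORS[how]
    row_totals = [agg(row) for row in grid]
    # zip(*grid) transposes the grid so each tuple is one column.
    column_totals = [agg(col) for col in zip(*grid)]
    return row_totals, column_totals


# Hypothetical result set: two row members by three column members.
grid = [
    [10, 20, 30],
    [40, 50, 60],
]

print(axis_totals(grid, "sum"))
print(axis_totals(grid, "avg"))
print(axis_totals(grid, "max"))
```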


6 Dec 2016

7 practical Big Data examples and applications

In the following applications, dashboards and examples you can see Big Data working in practice in different cases and with different technologies: Kafka, Spark, Apache Kylin, Neo4j...

Go to the examples

If you want to know more about Big Data, these links may interest you:

OLAP for Big Data. Is it possible?
How to start learning Big Data in 2 hours
List of Open Source Business Intelligence tools
Big Data OLAP analysis on Hadoop with Apache Kylin (in Spanish)
Real-time Apache Kafka use case, Big Data (in Spanish)


1 Dec 2016

Jedox 7 launch and what's new

Version 7 of Jedox, one of the best solutions for financial and sales planning and budgeting, has just been presented

Sign up for the free webinar in Spanish on December 20, from 15:30 to 17:30

Below we tell you about the new features, improvements and more. At this link you will find other posts we have published about Jedox

Official press release on the launch

Jedox 7:

- English-language page on what's new in Jedox 7

Jedox 7 is a true game-changer: Download our free "What's New" whitepaper and get all the details on smart modeling tools that bring your planning quickly up to speed, new design capabilities, enhancements to our innovative GPU technology, and so much more.



Jedox Models: Planning Made Simple

We are proud to introduce four all-new Jedox Models for Profit & Loss, Cost Center, Sales and Human Resources.

In 2017, Jedox and their partners will continue to provide a growing portfolio of these predefined and configurable planning applications through the new Jedox Marketplace. 

Discover how you can kickstart and improve your planning processes with our new Jedox Models.

Jedox Models 






25 Nov 2016

Business Intelligence for Hadoop Benchmark

This benchmark, which you can download from AtScale, is quite interesting: it offers insights about Business Intelligence on Hadoop.

If you are interested, check also our posts:

OLAP for Big Data. Is it possible?
List of Open Source Business Intelligence tools
Big Data OLAP analysis on Hadoop with Apache Kylin (in Spanish)
Real-time Apache Kafka use case, Big Data (in Spanish)

About the Benchmark:

Key Findings:

  • SQL-on-Hadoop engines are well suited for Business Intelligence (BI): All tested engines (Hive, Impala, Presto and Spark SQL) successfully executed all of the queries in our benchmark suite and are stable enough to support business intelligence workloads.

  • There is no single “best engine”: We continue to see the different engines shine in different areas. Depending on raw data size, query complexity, and the target number of end-users enterprises will find that each engine has its own ‘sweet spot’.

  • Version-to-version improvements are significant: The open source community continues to drive significant and rapid improvements across the board. All engines tested showed performance gains of between 2x and 4x in the six months between the first and second edition of the benchmarks. This is great news for those enterprises deploying BI workloads to Hadoop.

  • Small vs. Big Data: Impala and Spark SQL continue to shine for small data queries (queries against the AtScale Adaptive Cache). New in this edition, the latest release of Hive LLAP (Live Long and Process) shows suitable “small data” query response times. Presto also shows promise on small, interactive queries.

  • Few vs. Many Users: While Impala continues to shine in terms of concurrent query performance, Hive and SparkSQL showed improvements in this category. Presto, new to this edition of the benchmarks, showed the best results in our user concurrency testing.


20 Nov 2016

Types of roles in Analytics (Business Intelligence, Big Data)

As the analytics industry grows, it becomes harder to know the description of each role and position. What's more, the names are often used incorrectly, mixing up tasks, job descriptions, and so on.

This confuses both the specialists themselves and the people who are training and studying to do these jobs. In such a fast-changing industry, new and specialized positions appear frequently. Here we detail each of them:


Business Analyst:

Data Analyst:

Data and Analytics Manager:

Data Architect:

Data Engineer:

Data Scientist:

Database Administrator:

Statistician:

You may also be interested in:

How to pass an interview on Pentaho BI Open Source?
Data analyst skills and their differences
Start learning Big Data in 2 hours?

Seen on KDnuggets

17 Nov 2016

Dashboards and Business Intelligence for Smart Cities

More and more cities are implementing Smart City solutions, which cover a wide range of aspects in terms of technologies, devices, data analytics, and so on.

What they all have in common is that they must integrate diverse information and indicators from all kinds of data sources: traditional relational databases, social networks, mobile applications, sensors... It is essential that there are no silos or closed technologies, which is why Open Source is key: it can be adapted to all kinds of solutions.

Based on our experience in some of the smart city projects we have taken part in, we want to share a few technologies, resources and demos that may help you:

1. List of Open Source solutions for Smart Cities - Internet of Things projects

2. List of Open Source Business Intelligence tools for Smart Cities

3. 35 Open Source Tools for the Internet of Things (IoT)



Demos:

Big Data technologies

Business Intelligence demos

Near-real-time traffic monitoring for Madrid City Council (Access)

Geopositioning of dynamic routes (Access/Video)

Route recommendation (graphs) (Access/Video)



How to start learning Big Data in 2 hours

Big Data is one of the milestones of recent years. Many people want to approach it and learn the basics first, to get a general idea. But it is hard to find a quick guide that, in a couple of hours, lets you hold your own in Big Data, especially if you do not have strong technical skills.

So we have gathered a series of infographics, presentations, webinars, demos and documentation so that you can get a first view of Big Data in 2 hours!

1. Infographics

2. Webinar

View as a presentation

3. Demos

See online demos

4. Key points and presentations

5. Big Data Green Paper (Libro Verde del Big Data)

More info? Write to us