• 5-year Strategic Plan on Data Science Development

    This guideline is based on my personal experience in dozens of data analytics and data science projects and consulting engagements over the last decade.  It should fit different organizations, including corporations, non-profit organizations and institutions. Here are my suggestions for possible actions over a 5-year period:     Year 1:  Building the Foundation…

  • Importance of Real-time Data Analytics

    Real-time (or, better said, near real-time) analytics makes sense of all the real-time data passing through an organization.  Once a business is able to analyze data in real time, it can generate insights while the data is streaming instead of storing it and analyzing it in batches. Traditionally, data analysis happens once the data has been captured and…

  • Importance of Primary Key

    I have been working with relational databases, NoSQL databases and Hadoop HBase for years, mainly to store data for analysis.  A fundamental problem is that people still overlook the importance of the primary key in a table.  In this article, I would like to go through the details of the primary key, such as: definition,…

  • Career Opportunity

    My team (SDI) is hiring for the position of Assistant Data Engineer (trainee), based in Hong Kong.  It is an opportunity to develop your data science career. Our clients include Fortune 500 companies, government departments, listed corporations, statutory bodies and … many more. Please check the job ads: LinkedIn: https://www.linkedin.com/jobs/view/122773078/ WWW: https://smartdatainstitute.com/jobs

  • A Low Cost Data Repository for Data Science Project

    In the past month, I have done a fair amount of research and testing on different data science technologies, including on-premise and cloud solutions.  They are: Google BigQuery, Cloud Storage, Airflow; Qlik Sense, Data Streaming (CDC), Data Warehouse Automation, Data Lake Creation; Debezium CDC; EnterpriseDB, GaussDB (by Huawei). In the near future, I will share some…

  • Different Stages in Data Science Team Building

    In general, there are 3 stages of building a data science team: Early Stage, Mid Stage and Late Stage (Mature Stage). For each stage, the structure of the data science team should be different in order to meet the needs of the business or organization. Organizational Models: Meanwhile, there are 3 different types of Organizational…

  • Five Ways to Build a Data Science Team for your Organization

    Having worked as a data science consultant for years, I would like to share my viewpoints and experience on developing data science team capability for any organization. (NOTE: this article aims to share how to build data science capability for a general business.  However, it is not intended to help you build a Data Science consultancy or…

  • VPN Service – More than Security & Privacy

    Many people use a VPN to surf the Internet for various reasons.  The big question is which provider is more reliable and worthwhile. Beyond hiding your identity, you may think that it is not worth paying so much just for anonymous surfing when free services are available. However, there…

  • Basic Linux Server Security

    Many developers and data science experts use cloud servers for development or data science projects.  However, I have found that most of them overlook basic security protection for these servers. I should state that I am NOT a CISSP.  Nevertheless, I would like to share my experiences…

  • Forgettable CentOS Server

    There was an announcement from the CentOS team.  CentOS is going to bid farewell by the end of 2021 and be replaced by CentOS Stream.  CentOS Stream will no longer be positioned as a server OS and will be released just ahead of the current RHEL version. Original Announcement: CentOS Project shifts focus to CentOS Stream – Blog.CentOS.org This…
