IOTA Distributed Ledger: Beyond Blockchain for Supply Chains – The New Stack

MMS Founder
MMS RSS

Posted on nosqlgooglealerts. Visit nosqlgooglealerts


IOTA Distributed Ledger: Beyond Blockchain for Supply Chains


Explore how the IOTA Foundation is tackling supply chain digitization in East Africa, including the role of open source distributed ledgers and NoSQL.


Jan 25th, 2023 11:55am


Featured image via Pixabay.

The IOTA Foundation, the organization behind the IOTA open source distributed ledger technology built for the Internet of Things, envisions a future where every single trade item in the global supply chain is tracked and its provenance sourced using distributed ledgers. This vision is already becoming a reality in East Africa, thanks to the collaboration of the IOTA Foundation and TradeMark East Africa (TMEA). These organizations have teamed up to address the challenge of digitizing the export process for Kenya’s flower exporters, airlines and freight forwarders.

TMEA found that for just a single transaction, an African entrepreneur was completing an average of 200 communications, including 96 paper documents. The system developed by the IOTA Foundation and TMEA anchors the key trade documents on the Tangle, a new type of distributed ledger technology different from the traditional blockchain model, and shares them with customs in destination countries. This expedites the export process and makes African companies more competitive globally.

What’s behind this initiative from a technology perspective? That’s what José Manuel Cantera, technical analyst and project lead at the IOTA Foundation, recently shared. From a bird’s-eye view, it involves using:

  • EPCIS 2.0 data serialization formats for data interoperability
  • IOTA distributed ledgers to register every event happening within supply chains
  • ScyllaDB NoSQL for scalable, resilient persistent storage

Let’s dive into the details with a close look at two specific use cases: cross-border trade and end-to-end supply chain traceability. But first, Cantera’s perspective on the technical challenges associated with supply chain digitization.

Cantera crafted this talk for ScyllaDB Summit, a virtual conference for exploring what’s needed to power instantaneous experiences with massive distributed datasets. Register now (free + virtual) to join us live for ScyllaDB Summit 2023 featuring experts from Discord, Hulu, Strava, Epic Games, ScyllaDB and more, plus industry leaders on the latest in WebAssembly, Rust, NoSQL, SQL and event streaming trends. 

Supply Chain Digitization: Top Technical Challenges

Cantera began his talk by introducing three of the most pressing technical challenges associated with supply chain digitization.

First, there are multiple actors and systems generating data and integrating data across the supply chain — and verifying the identity of each is critical. Suppliers, OEMs, food processors, brands, recycling agents, consumers, ports, carriers, ground transporters, inspectors/authorities, freight forwarders, customs, dealers, repairers, etc. are all involved, and all must be verified.

Second, there are multiple relationships across all these actors, and these relationships cross borders with no central anchor and no single source of truth. In addition to business-to-business and business-to-consumer, there are also business-to-government and government-to-government relationships.

Third, there are different functional needs related to maintaining trust between the different actors through verifiable data. Traceability is key here. It’s an enabler for compliance, product authenticity, transparency and provenance with a view to different kinds of applications. For example, traceability is essential for ethical sourcing, food safety and effective recalls.

Use Case 1: Cross-Border Trade

For his first example, Cantera turns to cross-border trade operations.

“This is a multilayered domain, and there are many different problems that have to be solved in different places,” he warns before sharing a diagram that reins in the enormous complexity of the situation:

The key flows here are:

  • Financial procedures: The pure financial transaction between the two parties
  • Trade procedures: Any kind of document related to a commercial transaction
  • Transportation procedures: All the details about transporting the goods
  • Regulator procedures: The many different documents that must be exchanged between importers and exporters, as well as with the public authorities in the business-to-government relationships

So how is the IOTA Foundation working to optimize this complex and multilayered domain? Cantera explains, “We are allowing different actors, different government agencies and the private actors (traders) to share documents and to verify documents in one shot. Whenever a consignment moves between East Africa and Europe, all the trade certificates, all the documents can be verified in one shot by the different actors, and the authenticity and the provenance of the documents can be traced properly. And as a result, the agility of the trade processes is improved. It’s more efficient and more effective.”

All the actors in the flow visualized above are sharing the documents through the infrastructure provided by the IOTA distributed ledger using an architecture that’s detailed after the second use case below.

Use Case 2: End-to-End Supply Chain Traceability

In addition to tackling document sharing and verification for cross-border trade, there’s another challenge: tracing the provenance of the trade items. Cantera emphasizes that when we think about traceability, we need to think about the definition of traceability given by the United Nations: “The ability to identify and trace the history, distribution, location and application of products, parts and materials, to ensure the reliability of sustainability claims, in the areas of human rights, labor (including health and safety), the environment and anti-corruption.”

In principle, traceability implies the ability to follow history. In the case of trade items, this means knowing what has been happening with that particular trade item — not only its transportation, but also its origin. If one of the parties involved in the supply chain is making a claim about sustainability, safety, etc., the validity of that claim must be verifiable.

For example, consider a seemingly simple bag of potato chips. A farmer sells potatoes to a food processor, who turns the potatoes into a bag of potato chips. When growing the potatoes, the farmer used a fertilizer, which was produced by another manufacturer and contained raw materials from a different farmer. And when converting potatoes into potato chips, the food processor uses oils that stem from yet another source. And so on and so on. The history of all these things — the potatoes, the fertilizer, the oils, the bag containing the chips, and so on — needs to be known for traceability on that bag of potato chips.

All these details — from when the potatoes were harvested to the fertilizer used, where that fertilizer came from, and so forth — are all considered critical events. And each of these critical tracking events has key data elements that describe who, what, when, where, why and even how.

How IOTA Addressed the Top Technical Challenges

The IOTA Foundation applied several core technologies to address the top technical challenges across these use cases:

  • Data interoperability
  • Scalable data stores
  • Scalable, permissionless, feeless distributed ledger technology

Data Interoperability

In these and similar use cases, many different actors need to exchange data, which requires a standard syntax, with reference vocabularies, for semantic interoperability. Plus, it all needs to be extensible to accommodate the specialized needs of different industries (for instance, the automotive industry and the seafood industry have distinctly different nuances). Some of the key technologies used here include W3C’s JSON-LD, GS1’s EPCIS 2.0 and UN/CEFACT, which provides edi3 reference data models. IOTA also used sectoral standards for data interoperability, for example DCSA (maritime transportation), MOBI (connected vehicles and IoT commerce) and the Global Dialogue on Seafood Traceability, to name a few.

It’s worth noting that IOTA was deeply involved in the development of EPCIS 2.0, which is a vocabulary and data model (plus a JSON-based serialization format and accompanying REST APIs). It enables stakeholders to share transactional information regarding the movement and status of objects (physical or digital), identified by keys. Using this model, events are described as follows:

And that translates to JSON-LD in a format like this:
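The talk showed a concrete payload at this point. As a rough illustration of the format, here is a minimal sketch that builds an EPCIS 2.0 ObjectEvent and serializes it to JSON-LD from Python; the identifiers, business step and locations are hypothetical placeholders (and the @context URL should be checked against the current GS1 spec), not values from the IOTA/TMEA deployment:

import json
from datetime import datetime, timezone

# Hypothetical EPCIS 2.0 ObjectEvent: "these trade items were observed
# shipping from a given read point". All identifiers are placeholders.
event = {
    # Context URL per GS1; verify against the published EPCIS 2.0 artifacts.
    "@context": "https://ref.gs1.org/standards/epcis/2.0.0/epcis-context.jsonld",
    "type": "ObjectEvent",
    # When: event time plus the explicit local timezone offset EPCIS requires.
    "eventTime": datetime.now(timezone.utc).isoformat(),
    "eventTimeZoneOffset": "+00:00",
    # What: the identified trade items (EPC URIs are placeholders).
    "epcList": ["urn:epc:id:sgtin:0614141.107346.2018"],
    # Why: the business context of the observation.
    "action": "OBSERVE",
    "bizStep": "shipping",
    "disposition": "in_transit",
    # Where: read point and business location (placeholder GLN-based IDs).
    "readPoint": {"id": "urn:epc:id:sgln:0614141.00777.0"},
    "bizLocation": {"id": "urn:epc:id:sgln:0614141.00888.0"},
}

print(json.dumps(event, indent=2))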

Scalable Data Stores with ScyllaDB NoSQL

Establishing a scalable data store for all the critical data associated with each supply chain event was another challenge. Cantera explained, “If we are tracking every single item in the supply chains, we need to store a lot of data, and this is a big data problem. And here, ScyllaDB provides many advantages. We can scale our data very easily. We can keep the data for a long period of time at a fine granularity level. Not only that, but we can also combine the best of the NoSQL and SQL worlds because we can have robust schemas for having robust data and trusted data.”
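Cantera did not show the project’s actual data model, but the “robust schema” idea can be sketched with the Python cassandra-driver, which also works against ScyllaDB. The keyspace, table and column names below are illustrative assumptions: events are partitioned by trade item and clustered by time, so “all events for this item, newest first” is a single-partition read.

from cassandra.cluster import Cluster  # pip install cassandra-driver; compatible with ScyllaDB

# Address is an assumption: a single local ScyllaDB node.
cluster = Cluster(["127.0.0.1"])
session = cluster.connect()

session.execute("""
    CREATE KEYSPACE IF NOT EXISTS traceability
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
""")

# Illustrative events-repository table: partition key = the item's EPC,
# clustering key = event time, newest events first.
session.execute("""
    CREATE TABLE IF NOT EXISTS traceability.epcis_events (
        epc         text,
        event_time  timestamp,
        event_type  text,
        biz_step    text,
        disposition text,
        payload     text,
        PRIMARY KEY (epc, event_time)
    ) WITH CLUSTERING ORDER BY (event_time DESC)
""")

# Store the full EPCIS 2.0 JSON-LD document alongside a few queryable columns.
session.execute(
    "INSERT INTO traceability.epcis_events "
    "(epc, event_time, event_type, biz_step, disposition, payload) "
    "VALUES (%s, toTimestamp(now()), %s, %s, %s, %s)",
    ("urn:epc:id:sgtin:0614141.107346.2018", "ObjectEvent",
     "shipping", "in_transit", "{ ...JSON-LD event... }"),
)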

Cantera then continued to detail ScyllaDB’s role in this architecture, providing an example from the automotive supply chain. Consider an OEM with 10 million cars manufactured per year. Assume that:

  • Each car has 3,000 trackable parts.
  • Each part can have a lifetime of 10 years.
  • Each part can generate 10 business events.

This translates to around 300 billion active business events to store in ScyllaDB. Another example: Consider a maritime transportation operator that’s moving 50 million containers per year. Given 10 events per container and five years of operation, Cantera estimates about 2.5 billion active events here — just from the EPCIS 2.0 events repository. But there are also additional layers that require this level of data scalability.
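As a quick back-of-the-envelope check of those figures (a sketch only; the per-part and per-container event counts are the assumptions listed above):

# Automotive example: events generated per production year.
cars_per_year = 10_000_000
parts_per_car = 3_000
events_per_part = 10              # over each part's ~10-year lifetime
automotive_events = cars_per_year * parts_per_car * events_per_part
print(f"automotive: {automotive_events:,}")   # 300,000,000,000 -> ~300 billion

# Maritime example: events kept active over five years of operation.
containers_per_year = 50_000_000
events_per_container = 10
years_of_operation = 5
maritime_events = containers_per_year * events_per_container * years_of_operation
print(f"maritime:   {maritime_events:,}")     # 2,500,000,000 -> ~2.5 billion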

He closes his discussion of this challenge with a look at the many applications for ScyllaDB across this initiative:

  • Events repository (EPCIS 2.0, DCSA, …)
  • Item-level tracking
  • Inventory
  • Catalog
  • Any DLT Layer 2 data storage

Scalable, Permissionless, Feeless Distributed Ledger Technology

Scalable, permissionless and feeless distributed ledger technology also played a key role in the solution that the IOTA Foundation architected. For this, it tapped the IOTA distributed ledger in combination with protected storage such as IPFS to provide the functionalities around data and document verifiability, auditability and immutability within these peer-to-peer interactions.

For example, say you hire a particular transporter to move goods. When the activity begins, the transporter can generate an event that the trade items have started moving through the supply chain, and these events are committed to the IOTA distributed ledger. More specifically, the originator of the event generates a transaction on the distributed ledger, and that transaction can be later used by any participant in the supply chain to verify the authenticity of the event. And once the event is committed, the originator can no longer modify it. If the event was modified, the verification step would fail, and the supply chain partners might be understandably concerned.
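The IOTA client libraries are out of scope for this article, but the anchor-and-verify pattern Cantera describes can be illustrated generically: the originator commits a digest of the event, and any participant later recomputes the digest to confirm the event has not been altered. In this conceptual sketch a plain dictionary stands in for the distributed ledger; it is not IOTA SDK code.

import hashlib
import json

def digest(event: dict) -> str:
    # Canonicalize the JSON so every verifier hashes identical bytes.
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Stand-in for the ledger: an append-only map of event id -> committed digest.
ledger: dict[str, str] = {}

def commit_event(event_id: str, event: dict) -> None:
    """The originator anchors the event's digest."""
    ledger[event_id] = digest(event)

def verify_event(event_id: str, event: dict) -> bool:
    """Any supply chain participant re-derives the digest and compares."""
    return ledger.get(event_id) == digest(event)

event = {"type": "ObjectEvent", "bizStep": "shipping",
         "epcList": ["urn:epc:id:sgtin:0614141.107346.2018"]}
commit_event("evt-001", event)
print(verify_event("evt-001", event))   # True
event["bizStep"] = "receiving"          # tampering after the fact...
print(verify_event("evt-001", event))   # False: verification fails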

Here’s how it all fits together:

Tip: For Cantera’s block-by-block tour of this reference architecture, see the video below, starting at 17:15.

Conclusions

Supply chain digitization is rife with technical challenges, so it’s not surprising that a nontraditional mix of technologies is required to meet the IOTA Foundation’s highly specialized needs. Cantera sums it up nicely:

“It requires interoperability — which means it’s important to align with the open standards, EPCIS 2.0, the decentralized ID coming from W3C verifiable credentials. It requires a reference architecture to guarantee that semantic interoperability and some reusable building blocks are used. It requires decentralization, and decentralizing data requires distributed ledger technology — in particular, public, permissionless and feeless distributed layers like IOTA complemented with IPFS, and relying more and more on decentralized applications. It also requires data scalability and availability, and ScyllaDB is the perfect partner here. Last but not least, it requires trusted data sharing with technologies like decentralized IDs, distributed ledger technologies, and peer-to-peer.”





Hightouch Unveils Personalization API, Combining the Analytical Power of the Data Warehouse with a Real-Time API

MMS Founder
MMS RSS

Posted on nosqlgooglealerts. Visit nosqlgooglealerts

Unlock the full potential of the cloud data warehouse with a highly performant API to deliver personalized experiences across the customer journey

SAN FRANCISCO, Jan. 25, 2023 /PRNewswire/ — Hightouch, the leading Data Activation platform, today announced the release of the Personalization API, a low-latency API designed to personalize any customer experience. The Personalization API allows any system—including internal tools and external SaaS applications—to fetch cloud warehouse data with sub-30 millisecond response time. The feature supports any data schema (e.g., users, companies, products) and integrates natively with all popular data warehouses and transactional SQL and NoSQL databases. Now, every business user has real-time access to customer data, wherever and whenever they need it.
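Hightouch’s documentation defines the actual request shape; purely to illustrate the pattern of a low-latency key lookup against warehouse-synced data, a client call might look like the hypothetical sketch below. The endpoint, header and response fields are assumptions for illustration, not Hightouch’s real API.

import requests

API_URL = "https://personalization.example.com/v1/users"   # placeholder endpoint, not a real Hightouch URL
API_KEY = "YOUR_API_KEY"                                    # placeholder credential

def get_profile(user_id: str) -> dict:
    # Fetch one user's warehouse-backed profile; the tight timeout reflects
    # the sub-30 ms read path the product is built around.
    resp = requests.get(
        f"{API_URL}/{user_id}",
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=0.5,
    )
    resp.raise_for_status()
    return resp.json()   # e.g. {"user_id": "...", "recommended_products": [...]}

profile = get_profile("user_123")
print(profile.get("recommended_products"))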


Modern consumers want, if not expect, personalized experiences with every brand interaction. A recent Forrester study found that 68% of shoppers are unlikely to return to a website or store that doesn’t provide a satisfactory customer experience. However, as many as 63% of digital marketers still struggle to harness the potential of personalization technology, according to a recent Gartner study. Hightouch’s Personalization API addresses this pain point and makes it easy for any business team to use the modeled data that already exists in the warehouse and translate it into compelling customer experiences in real-time.

“Traditional CDPs fail to deliver personalization at most organizations because their treatment of data doesn’t allow for experimentation and agility, especially for complex businesses,” explains Austin Hay, Head of Marketing Technology at Ramp. “Hightouch’s Personalization API drastically simplifies personalization by leveraging the data warehouse, giving teams the ability to experiment quickly with more complex architectures.”

With Personalization API, businesses can address an array of new use cases, making it easier than ever to deliver tailored customer experiences, including:

  • Powering personalized marketing campaigns by enriching marketing touchpoints via customer engagement platforms with dynamic data points like product recommendations

  • Delivering in-app or web personalization like customized search results, article recommendations, or nearest store locations to drive conversions

  • Optimizing product experimentation by delivering up-to-date customer information, like audience inclusion or exclusion, to decisioning and experimentation systems

“There’s a massive shift towards the cloud data warehouse as the single source of truth across an organization,” said Tejas Manohar, Co-CEO and Co-Founder of Hightouch. “All teams need the infrastructure and tools to use this data, and we’re excited to expand our platform offering with this new feature to support even more of their use cases.”

Personalization API is generally available today to Hightouch Business Tier customers. To learn more, read the blog post or schedule a demo with the Hightouch team.

About Hightouch

Hightouch is the world’s leading Data Activation platform, syncing data from warehouses directly into your SaaS tools. All business teams, from sales and marketing to support and customer success, need relevant, accurate, and real-time customer data to add critical context in the software they already use. Whether you’re enhancing communications with customers via CRM, optimizing ad copy, or personalizing email, Hightouch makes your data actionable. For more information, visit www.hightouch.com.


View original content to download multimedia:https://www.prnewswire.com/news-releases/hightouch-unveils-personalization-api-combining-the-analytical-power-of-the-data-warehouse-with-a-real-time-api-301730610.html

SOURCE Hightouch



Avoid Being an “Ivory Tower” Architect: The Relationship between Architects and Their Organisation

MMS Founder
MMS Eran Stiller

Article originally posted on InfoQ. Visit InfoQ

In a recently published episode of Armchair Architects, the speakers discussed the relationship between software architects and the rest of the organisation. They describe how a successful architect alternates between going into the trenches to zoom in on a single tree and zooming back out to check whether that tree still fits into the forest.

Uli Homann, Corporate Vice President at Microsoft, states the following:

Sometimes architects are considered Ivory Tower Architects because they’re not involved in the real trenches. You don’t understand what the pressures are, what the realities are, and you’re telling me to use this technology and do it this way without having a detailed understanding of what the implications are.

I think the goal of a higher-level architect is to drive the direction of many efforts. When you are in the trenches, you always look at the tree and don’t really look at the forest anymore. So you need this balance between a very detailed understanding but also the ability to zoom back out and say, wait a second, are we still on the right path or are we going left where everybody else is going right?

The only way you can avoid getting into this complexity of people not liking you or believing that you are just talking rather than doing is – “do”. You have to be part of the conversation, and the trick is to still be able to go back and forth. You understand the tree, then go back and make sure that the tree still fits into the forest and update the enterprise architecture strategy based on the learning you learned in the trenches.

Homann explains that this disconnect happens when architects say something and that something actually makes sense, but then reality hits, and it doesn’t quite make sense anymore. If the feedback loop doesn’t happen, the architecture doesn’t get updated with real-life feedback and drifts apart from reality. “It’s OK to give a direction and strategy, but then go deep with the team that has to live with the decision and learn from their detailed conversation if the stuff you’re trying to build actually works.”

Eric Charran, Chief Architect at Microsoft, explains how he believes software architects should be part-time civil servants and part-time community organisers. As a civil servant, the architect’s goal is to help the team achieve its goal, including getting their own hands dirty. “How can I help?” is a crucial question, as well as “Here are some tools and techniques that would help”. As community organisers, architects should take what they learned and spread that knowledge to the rest of the organisation, crediting the teams for that knowledge appropriately. He says, “as an architect, I’m successful when teams who stand on my shoulders do real stuff.”

When the host, David Blank-Edelman, Senior Cloud Advocate at Microsoft, asks how to get people to listen to you, Charran replies that people want to do a good job and will listen if they see you can help them. He also comments that people don’t make factual decisions. They make emotional decisions and look for facts to back them up. “You have to be willing to invest the time to help them get to a comfort state where they can listen to your points. If they’re not willing to listen to your points, you might be 100% right, but that’s only 50% of the battle.”

Charran states that if architects explain the same thing repeatedly to the same people, they should only use their organisational authority and become the “friendly ball thrower” as a last resort. Homann adds that architects should always strive to support their advice with external evidence and finishes by stating that if architects fail to get to people by themselves, they can also try to reach them via others who can.



HC-tree is a High-Concurrency Backend for SQLite Supporting Replication

MMS Founder
MMS Sergio De Simone

Article originally posted on InfoQ. Visit InfoQ

HC-tree is a project that aims to build a new backend for SQLite specifically optimized for high concurrency and leader-follower style replication. While still experimental, HC-tree can be used as an SQLite drop-in replacement, albeit with limitations.

SQLite is sometimes used as the core of a client/server database system. While it works reliably well in such cases, the database backend module that it uses to store b-tree structures in its database file was not designed with this case in mind and can be improved upon in several ways.

In particular, SQLite does not fully support multiple simultaneous writers. Even when using its begin-concurrent extension, which uses page-level locking to allow multiple writers, SQLite may run into locking conflicts and thus needs to serialize all commit operations.
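To see the baseline behaviour HC-tree is trying to improve on, here is a small sketch using Python’s built-in sqlite3 module against stock SQLite (without the begin-concurrent extension): while one connection holds a write transaction open, a second writer cannot even begin its own.

import sqlite3

DB = "demo.db"

# isolation_level=None keeps the connections in autocommit mode, so the
# explicit BEGIN/COMMIT statements below are exactly what SQLite executes.
setup = sqlite3.connect(DB, isolation_level=None)
setup.execute("CREATE TABLE IF NOT EXISTS t (k INTEGER PRIMARY KEY, v TEXT)")
setup.close()

writer1 = sqlite3.connect(DB, timeout=0.1, isolation_level=None)
writer2 = sqlite3.connect(DB, timeout=0.1, isolation_level=None)

# Writer 1 takes the write lock and keeps its transaction open.
writer1.execute("BEGIN IMMEDIATE")
writer1.execute("INSERT INTO t (v) VALUES ('from writer 1')")

# Writer 2 cannot start its own write transaction: stock SQLite allows a
# single writer at a time, so commits are effectively serialized.
try:
    writer2.execute("BEGIN IMMEDIATE")
except sqlite3.OperationalError as exc:
    print("second writer blocked:", exc)   # "database is locked"

writer1.execute("COMMIT")   # after this, writer 2 could retry and proceed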

HC-tree, on the contrary, is designed to support dozens of concurrent writers, say the team behind it, thanks to its optimistic row-level locking.

HC-tree offers similar MVCC based optimistic concurrency to SQLite with the begin-concurrent extension, except that it validates transactions based on the keys and logical ranges of keys accessed instead of the set of pages.

The goal of the project is to be at least as fast as SQLite in the single-threaded case, so that running concurrent writes can bring a real performance advantage. As the official benchmarks show, HC-tree easily outperforms stock SQLite in a number of different concurrent scenarios and is on a par with it in single-writer cases.

Another goal of HC-tree is improving SQLite support for replication beyond what is permitted by the standard sessions extension, which enables serializing the content of a committed transaction to send it to a different database. To this end, HC-tree will promote the sessions extension to be part of the backend and add support for managing leader-follower transactions, meaning that changes can be applied more quickly from the leader database to followers because no validation is required.

As a last note regarding improved features, HC-tree aims to replace 32-bit page numbers with 48-bit page numbers in the future, to go beyond SQLite’s 16TiB database size limit.

HC-tree is still a work in progress and while it can be used for experimentation and assessing whether it is a good replacement for an existing SQLite-based solution, it still has a number of limitations, including missing support for BEGIN EXCLUSIVE, reduced efficiency for transactions that do not fit entirely in main memory, and others that you can check out on the project’s official page.



Couchbase Adds Azure Support To Capella – I Programmer

MMS Founder
MMS RSS

Posted on nosqlgooglealerts. Visit nosqlgooglealerts

Couchbase has announced that Capella, the fully managed service version of its distributed NoSQL database that includes mobile and IoT application services, now supports Microsoft Azure. 

Couchbase is a distributed NoSQL cloud database that combines the properties of a distributed document database (JSON) with features of a traditional DBMS including distributed SQL, transactions, and ACID guarantees.


Capella was launched at the end of 2021 with support for AWS, and since then support has been added for GCP and now Azure. Customers using Capella on Google Cloud pay a fee to Google, with one Capella credit costing $1. The new Azure service will be billed by Couchbase, though no further details of the pay-per-consumption model were given.

As well as the choice of AWS, GCP and Azure, Capella customers can also run hybrid models of self-managed clusters that sync data to Capella App Services, which Couchbase describes as a fully managed backend designed for mobile, IoT, and edge apps. From a developer perspective, App Services can be used to access and sync data between Capella and edge devices, as well as to authenticate and manage app users.

Capella also comes with a better developer UI than the original launch version. The evolved UI has been changed to make key tools, workflows, and information easier for developers to find and use, with a layout inspired by GitHub and a Quick Start dashboard with links to the Query Workbench, Import and App Services tools, and to SDKs.

There’s also a new storage engine, Magma, that is designed to be highly performant even with very large datasets that do not fit in memory. Couchbase says that Magma really shines when used for datasets that will not fit into available memory and that require maximum data compression.

Capella with support for Azure is available now. 


More Information

Couchbase Website

Related Articles

Couchbase 7 Adds Relational Support Model

Insights From Couchbase Connect

Couchbase Connect Goes Online

Couchbase Launches JSON Analytics

Couchbase Mobile Version Adds Sync Gateway

Couchbase Server 4.0 Now With N1QL

What Happens To Couchbase Now?

Couchbase Server 2.0

 



AI Developers Release Open-Source Implementations of ChatGPT Training Algorithm

MMS Founder
MMS Anthony Alford

Article originally posted on InfoQ. Visit InfoQ

AI research groups LAION and CarperAI have released OpenAssistant and trlX, open-source implementations of reinforcement learning from human feedback (RLHF), the algorithm used to train ChatGPT. Independent AI developer Phil Wang has also open-sourced his own implementation of the algorithm.

LAION, the Large-scale Artificial Intelligence Open Network, is a non-profit machine learning research organization dedicated to making AI models, datasets, and code available to the public. In 2022, InfoQ covered LAION’s release of LAION-5B, an AI training dataset containing over five billion image-text pairs. LAION’s latest project is OpenAssistant, which is intended to “give everyone access to a great chat based large language model.” The planned MVP implementation of OpenAssistant will be based on OpenAI’s InstructGPT paper: a dataset of human-generated instructions, a dataset of machine-generated responses and their human rankings, and an implementation of RLHF. According to LAION:

We are not going to stop at replicating ChatGPT. We want to build the assistant of the future, able to not only write email and cover letters, but do meaningful work, use APIs, dynamically research information, and much more, with the ability to be personalized and extended by anyone. And we want to do this in a way that is open and accessible, which means we must not only build a great assistant, but also make it small and efficient enough to run on consumer hardware.

CarperAI is a new lab within the EleutherAI research group, tasked with “improving the performance and safety of large language models (LLMs) with reinforcement learning.” InfoQ previously covered EleutherAI’s development of open-source language model GPT-NeoX. In October 2022, the lab announced a project to train and publicly release “instruction-tuned” models using RLHF. The project is a cooperative effort of several organizations, including HuggingFace, Scale, and Humanloop. As part of this project, CarperAI open-sourced Transformer Reinforcement Learning X (trlX), a framework for fine-tuning HuggingFace language models using RLHF.

Phil Wang, an AI developer known for open-source implementations of deep learning research models such as Imagen and Make-A-Video, shared his work-in-progress implementation of RLHF for the PaLM language model called PaLM + RLHF. Wang notes that there is no pre-trained model, only a framework for users to train their own. He also recommends users interested in replicating ChatGPT should join the LAION discord channel.

Although these open-source projects include implementations of ChatGPT’s training methods, they do not have any trained models currently available. Wang’s project FAQ suggests that training might require “millions of dollars of compute + data” to complete. LAION’s roadmap document for OpenAssistant does list efforts to collect data and train models, but isn’t clear on when trained models might be released. CarperAI’s Twitter account noted:

We haven’t released any RLHF models yet officially, just a few small replication efforts of hh-RLHF, learning to summarize, etc in our discord. We can match performance reported in respective papers on these.

Several prominent members of the AI community have discussed these efforts on social media. On Twitter, HuggingFace CTO Julien Chaumond predicted that in six months there will be “10 open reproductions of ChatGPT.” AI researcher Sebastian Raschka replied:

Agreed, there will be many open source implementations of ChatGPT. But there won’t be many high-quality models. I think we underestimate how much people hate labeling (or worse: writing) training data by hand.

StabilityAI’s founder Emad Mostaque tweeted that his company is “working on open chatGPT.” He also said:

Toughest part of open chatGPT creation (aside from millions of bucks for RL bit) is the governance aspect…The nice thing is once all the blood sweat and tears go into creating the models and frameworks they can proliferate like crazy as a new type of dev primitive.

The source code for OpenAssistant, trlX, and PaLM + RLHF are all available on GitHub.



Log Analytics Feature in Cloud Logging Is Now Generally Available

MMS Founder
MMS Steef-Jan Wiggers

Article originally posted on InfoQ. Visit InfoQ

Google recently made its Cloud Logging Log Analytics feature generally available (GA), allowing users to search, aggregate, and transform all log data types, including application, network, and audit logs.

Last year the company released Log Analytics in preview as part of the Cloud Logging service. With the GA release, Google includes three new capabilities:

  • Multi-region support for US and EU regions, allowing users to store and analyze their logs in the most convenient region, improving performance and reducing latency.
  • Improved query experience to save and share queries, allowing users to reuse and share their most important queries.
  • Support for custom retention for up to 10 years, where custom log retention pricing applies.

 
Source: https://cloud.google.com/blog/products/devops-sre/log-analytics-in-cloud-logging-is-now-ga/

To leverage Log Analytics, users can create a log bucket and upgrade it to use Log Analytics using the Google Cloud console or the Google Cloud CLI (with the ‘gcloud logging’ command). Then, they can use SQL to query logs stored in their log bucket. In addition, to use BigQuery to query the data, users must create a linked dataset.
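As a rough sketch of the linked-dataset path, a SQL query over an upgraded bucket could then be issued from Python with the BigQuery client; the project, dataset and view names below are placeholders (the exact resource naming for a linked Log Analytics dataset is documented by Google):

from google.cloud import bigquery   # pip install google-cloud-bigquery

client = bigquery.Client(project="my-project")   # placeholder project ID

# Placeholder view in a dataset linked to an upgraded Log Analytics bucket.
query = """
    SELECT timestamp, severity, json_payload
    FROM `my-project.my_linked_dataset._AllLogs`
    WHERE severity = 'ERROR'
      AND timestamp > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)
    ORDER BY timestamp DESC
    LIMIT 100
"""

for row in client.query(query).result():
    print(row.timestamp, row.severity)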

Use cases for Log Analytics in Cloud Logging include debugging and troubleshooting, where querying log data helps identify and diagnose issues with applications and infrastructure. Furthermore, log data can be used to monitor the performance of applications and identify bottlenecks or, when security related, queried to detect security breaches and suspicious activity.

Yuki Ito, a Software Architect, and Google Cloud Champion Innovator, tweeted:

> Log Analytics in Cloud Logging is now GA
I just started to use Log Analytics to generate insights from API access logs

Google’s competitor in the public cloud space, Microsoft, has a similar service in Azure also called Log Analytics (part of Azure Monitor). It is a tool in the Azure portal allowing users to edit and run log queries against data in the Azure Monitor Logs store. The queries are performed with a proprietary Kusto query language (KQL). In addition, AWS also offers log analytics capabilities through Amazon CloudWatch.

Lastly, the pricing of Log Analytics is included in the standard Cloud Logging pricing, and more details on the Cloud Logging service are available on the documentation landing page.



DataStax helps orgs tackle real-time data management at scale – SiliconANGLE

MMS Founder
MMS RSS

Posted on nosqlgooglealerts. Visit nosqlgooglealerts

It’s now widely accepted that all companies are data companies, regardless of the industry. While data is growing in value, it’s increasingly difficult to store large amounts of data securely, sparking a need for services to handle the exponentially increasing data.

California-based company DataStax Inc. helps organizations scale their data needs with its cloud-native NoSQL database built on Apache Cassandra.

“We’re taking [Apache Cassandra], combining it with other technologies, such as Apache Pulsar for streaming, to provide a real-time data cloud, which helps our customers build applications faster and help them scale without limits,” said Thomas Been (pictured), chief marketing officer of DataStax.

Been spoke with theCUBE industry analysts Lisa Martin and Dave Vellante at the recent AWS re:Invent conference, during an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. They discussed real-time data and how DataStax uses Apache Cassandra, technologies such as machine learning, and more. (* Disclosure below.)

The rise of real-time data

Real-time data has been around for a while but was previously an extremely niche service, Been explained. DataStax adapted to democratize data, making it easier for users to access all on serverless, cloud-native solutions.

“Home Depot, as an example, was able to deliver curbside pickup delivery in 30 days because they were already using DataStax and could adapt their business model with a real-time application,” Been said. “We also see a real strong trend with customer experiences, and increasingly a lot of tech companies are coming to us because scale means success to them.”

DataStax chose Apache Cassandra for many reasons, including its ability to process millions of data points with no single point of failure.

“It used to be a database that was a bit hard to manage and develop with, and that is why, as part of the cloud, we wanted to change these aspects, provide developers the API they like and need,” Been said. “This makes it super simple to operate and super affordable to run.”

Here’s the complete video interview, part of SiliconANGLE’s and theCUBE’s coverage of AWS re:Invent:

(* Disclosure: DataStax Inc. sponsored this segment of theCUBE. Neither DataStax nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)

Photo: SiliconANGLE



Cloud Databases Are Maturing Rapidly, Gartner Says – Datanami

MMS Founder
MMS RSS

Posted on nosqlgooglealerts. Visit nosqlgooglealerts

(Blackboard/Shutterstock)

The mad dash to the cloud continues, but now we’re entering a maturation phase that’s bringing more advanced capabilities for both analytical and transactional workloads in the cloud, Gartner analysts wrote in their latest report on cloud database management systems.

In their latest Magic Quadrant for Cloud Database Management Systems, Gartner analysts Henry Cook, Merv Adrian, Rick Greenwald, and Xingyu Gu sliced and diced the market for cloud database offerings, which accounts for more than half of the total database market. Twenty vendors made the final cut, which includes a mix of relational and non-relational databases, document stores, and data warehouses.

While there are winners and losers in the database horse race (a result of Gartner’s new “momentum index”), the ultimate winners are the database customers, who are gaining access to a broad array of ever more-powerful capabilities for storing and processing data.

“There has been a general lifting of capabilities across the market,” the Gartner analysts wrote. “Whereas in the last two to three years, the emphasis has been moving to the cloud, often with basic capability, this last year has seen a marked maturing of the majority of offerings.”

For example, certain capabilities in analytical databases, such as built-in machine learning, low-code development, or multi-modality, would have earned the database an “advanced” label in previous years, they said, “yet now they are standard parts of many offerings.”

Similarly, a database customer may have been duly impressed if her transactional databases could process distributed transactions across many processors and geographic areas or could leverage hyperscale architectures. But now those capabilities are “normal,” the analysts wrote.

Magic Quadrant for Cloud Database Management Systems, December 2022 (Image courtesy Gartner)

“For both types of systems, elastic scalability, SQL support (including for nonrelational databases) and the ability to mix operational and analytical working is becoming business as usual,” the analysts wrote. “Thus, all enterprises now have easier access to a wide range of database capabilities.”

Because cloud database capabilities are improving so quickly, Gartner warned customers not to pick a database based solely on a single feature. Most features are “copyable,” the analysts said, and if it’s valuable, it will be quickly copied by competitors.

So, how should prospective database customers pick a database? The number one metric is price performance, the analysts write. In particular, it’s the “price” side of that equation that’s really moving the needle at this point, they wrote.

“A key consideration is financial governance–the ability to predict, monitor and control costs,” the analysts wrote. “This is a common problem across all cloud-based systems where customers pay according to consumption rather than by an upfront investment.”

The cloud giants fared particularly well in Gartner’s Magic Quadrant, with AWS holding the most prominent spot, thanks to a broad array of database offerings and huge installed base. Microsoft Azure and Google Cloud also were prominently placed, along with relational database giant Oracle.

Database old guards IBM and SAP, both of which have considerable installations of on-prem relational database customers whom they are trying to move to the cloud, were also featured in the leader’s quadrant. Analytic database rivals Databricks and Snowflake also found themselves in the leader’s quadrant, along with Alibaba Cloud, NoSQL database king MongoDB, resurgent analytic database maker Teradata, and reformed Hadoop distributor Cloudera.

OEM database giant InterSystems, which is trying to become more visible, occupied the visionaries quadrant along with NoSQL vendor MarkLogic, which is being acquired by Progress. An open source multi-modal database, Redis, is the lone entry in the challenger’s quadrant.

Meanwhile, the niche players quadrant features two graph databases, Neo4j and TigerGraph, along with scale-out relational player Cockroach Labs and Couchbase, which sells a non-relational document store. Tencent Cloud is also listed in the niche players quadrant; all of these were new entries to this Magic Quadrant except for Tencent Cloud. Gartner dropped four databases from last year’s report due to the market momentum index: Exasol, Huawei, MariaDB, and SingleStore.

Data meshes and data fabrics figure to play prominently in databases’ futures, Gartner says (Oleksii Lishchyshyn/Shutterstock)

The analysts made several other observations about the cloud database market, which accounted for the lion’s share of the growth of the overall database market, valued at $80 billion. Momentum in open source databases continues to increase, they said, and there is also a trend of front-ending a database with a PostgreSQL or MySQL API, which have become industry standards.

We’re also seeing the rise of “data ecosystems,” the Gartner analysts said, which combine a specific database capability with other integrated services. Customers can sign up for a data ecosystem from a single vendor, or build their own data ecosystem from multiple vendors.

Data meshes and data fabrics are also catching the eyes of analysts, who see greater participation and inclusion of databases in the years to come. The two architectural patterns will “entail a much greater degree of data management automation, metadata handling and interfacing, and incorporation of AI and machine learning into the business of data management itself,” the analysts write. “Although there are some very early indications of this in current offerings, this next wave has not yet begun but will likely begin another phase of disruption from 2023 onwards.”

Cloudera is sharing the Magic Quadrant for Cloud Database Management Systems. You can access it here.

Related Items:

Are Databases Becoming Just Query Engines for Big Object Stores?

Fivetran Benchmarks Five Cloud Data Warehouses

Cloud Now Default Platform for Databases, Gartner Says

 

