Mobile Monitoring Solutions


Data Fetching Patterns for a Better User Experience – Joe Savona at React Conf

MMS Founder
MMS Bruno Couriol

Article originally posted on InfoQ. Visit InfoQ

Joe Savona explored at React Conf some of the ways Relay and Suspense can help improve the user loading experience, along with the best practices identified in production for using Suspense for data fetching.

Savona started by recalling the problem that Suspense aims to solve. Applications often need to fetch a lot of data to show to users. In a React context, the application is implemented with components, and each component may fetch its required data when it mounts, displaying a loading indicator in the meantime. This simple pattern, dubbed fetch-on-render in the presentation, results in several roundtrips to the server and a waterfall of loading indicators. Relay was introduced 5 years ago to alleviate some of these problems.

Loading experience made of an increasing number of loading indicators

React developers colocate in the component code the description of the data that they need:

function Post(props) {
  const data = useFragment(
    graphql`
      fragment Post_fragment on Post {
        title
        body
      }
    `,
    props.post
  );

  return (
    <>
      <Title>{data.title}</Title>
      <Body body={data.body} />
    </>
  );
}

Relay aggregates the data dependencies across the whole application’s component tree and can fetch them in a single roundtrip to the server. This fetch-then-render pattern avoids the previously shown spinner waterfall.
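For illustration, here is a minimal sketch of how a parent component might compose the Post fragment into a single query, assuming Relay’s hooks APIs (graphql and useLazyLoadQuery imported from react-relay/hooks); the HomeQuery name and latestPost field are hypothetical:

function Home(props) {
  const data = useLazyLoadQuery(
    graphql`
      query HomeQuery {
        # hypothetical field; the fragment spread lets the Relay compiler
        # aggregate Post's data requirements into this single query
        latestPost {
          ...Post_fragment
        }
      }
    `,
    {}
  );

  return <Post post={data.latestPost} />;
}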

However, waiting for all the data to arrive before displaying the whole page may still result in a sub-optimal loading experience. Incrementally rendering components as they become ready can provide a better user experience, in particular when loading indicators are replaced with placeholders that occupy the full space assigned to the component in the layout. Taking the new facebook.com as an example, Savona explained that the header displays first, followed by the left navigation, then the stories, then the posts and feeds. If the news feed happens to receive its data first, it is better to wait and display the left navigation first, and only then the news feed. Otherwise, the user may suffer brief and jarring layout changes.

The loading of new pages (i.e. transitions) in single-page applications can also be sped up by downloading the required data and code for the next page in parallel, with a carefully prepared transition between pages that minimizes the perceived waiting time. The pattern used on facebook.com is dubbed render-as-you-fetch.

In this context, Savona went on to describe how the pattern is implemented on Facebook’s new website. As mentioned before, components declaratively describe their data requirements by means of a GraphQL query colocated in the component code. Components that need remote resources to render are wrapped in a <Suspense fallback={<Placeholder/>}> component; React displays the fallback while the necessary resource is being fetched. In order to coordinate the rendering of pending components, the <SuspenseList/> component is used:

function Home(props) {
  return (
    <ErrorBoundary fallback={<ErrorMessage/>}>
      <SuspenseList revealOrder="forwards">
        <Suspense fallback={<ComposerFallback/>}>
          <Composer />
        </Suspense>
        <Suspense fallback={<FeedFallback/>}>
          <NewsFeed/>
        </Suspense>
      </SuspenseList>
    </ErrorBoundary>
  )
}

With the previous code, the news feed and composer will render their placeholders first. The composer will then render before the news feed, independently of the order in which the fetched resources arrive:

Ordered loading experience resulting from using the SuspenseList component

Additionally, the <Suspense /> component further optimizes perceived loading performance by waiting a short amount of time once the first component is ready to render. If the second component becomes ready within that window, both components are rendered together. React’s user research has shown the positive impact of this pattern:

it turns out that users’ perception of performance is determined by more than the absolute loading time. For example, when comparing two apps with the same absolute startup time, our research shows that users will generally perceive the one with fewer intermediate loading states and fewer layout changes as having loaded faster.

Developers should also note the <ErrorBoundary/> component, which handles the case of a failed resource fetch.

Additionally, in the handlers of events that trigger page transitions, Facebook developers use a specific API that identifies the resources (code and data) needed for the next page. The fetching of those resources thus starts in the handler, in parallel, before any rendering of the next page.
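A minimal sketch of that approach, assuming Relay’s experimental preloadQuery/usePreloadedQuery APIs as presented around the time of the talk (the ProfileQuery, environment, and navigate names are hypothetical):

// In the event handler: start fetching the data before rendering
function onNavigateToProfile(id) {
  const preloadedQuery = preloadQuery(environment, ProfileQuery, {id});
  navigate(<ProfilePage queryRef={preloadedQuery} />);
}

function ProfilePage(props) {
  // Suspends until the data requested in the event handler arrives
  const data = usePreloadedQuery(ProfileQuery, props.queryRef);
  return <Profile user={data.user} />;
}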

Savona’s full talk is available on ReactConf’s site and contains further code snippets, animated demos, and detailed explanations. React Conf is the official Facebook React event. React Conf was held in 2019 in Henderson, Nevada, on October 24 & 25.



Blender, Facebook State-of-the-Art Human-like Chatbot, Now Open Source

MMS Founder
MMS Sergio De Simone

Article originally posted on InfoQ. Visit InfoQ

Blender is an open-domain chatbot developed at Facebook AI Research (FAIR), Facebook’s AI and machine learning division. According to FAIR, it is the first chatbot that has learned to blend several conversation skills, including the ability to show empathy and discuss nearly any topic, beating Google’s chatbot in tests with human evaluators.

Some of the best current systems have made progress by training high-capacity neural models with millions or billions of parameters using huge text corpora sourced from the web. Our new recipe incorporates not just large-scale neural models, with up to 9.4 billion parameters — or 3.6x more than the largest existing system — but also equally important techniques for blending skills and detailed generation.

Blender was trained using previously available public domain conversations including 1.5 billion conversation examples. The resulting neural network was too large to fit on a single device, so Facebook engineers split it into smaller pieces to make it scale to even larger datasets.

Skill blending is, as mentioned, one of Blender’s key features.

Rather than being specialized in one single quality, a good open-domain conversational agent should be able to seamlessly blend them all into one cohesive conversational flow.

Based on a new dataset called Blended Skill Talk (BST), Blender’s skill blending capabilities include displaying a consistent personality to improve naturalness of dialogues, using knowledge to conduct discussions on an open range of topics, and displaying empathy.

Blending these skills is a difficult challenge because systems must be able to switch between different tasks when appropriate, like adjusting tone if a person changes from joking to serious.

Another distinctive trait of Blender is its approach to generation strategies, which are used to ensure chatbots do not repeat themselves, provide overly lengthy or shallow responses, or display other shortcomings. Facebook engineers opted for a careful choice of search hyperparameters over sampling to reach an optimal balance of how lively or dull a conversation is.

Facebook engineers evaluated Blender by letting human evaluators compare its performance to Google’s latest Meena chatbot. The test was conducted comparing Blender’s and Meena’s chat logs.

When presented with chats showing Meena in action and chats showing Blender in action, 67 percent of the evaluators said that our model sounds more human, and 75 percent said that they would rather have a long conversation with Blender than with Meena.

According to Facebook, Blender’s edge over Meena can be explained by Blender’s skill blending and generation strategies. Strikingly, human evaluators preferred a conversation with Blender over a conversation with humans 49% of the time, a figure that decreases to 36% for models unable to blend skills.

The evolution of human-like chatbots does not end with Blender, which still displays a number of shortcomings, like contradicting or repeating itself, or “hallucinating” knowledge, i.e. making up facts.

We’re currently exploring ways to further improve the conversational quality of our models in longer conversations with new architectures and different loss functions. We’re also focused on building stronger classifiers to filter out harmful language in dialogues. And we’ve seen preliminary success in studies to help mitigate gender bias in chatbots.

Major areas of research for future development include mitigating gender bias and filtering out harmful language. Facebook hopes that Blender can help the AI research community further advance the state of the art of conversational chatbots.



Presentation: You Can AI Like an Expert

MMS Founder
MMS Jon McLoone

Article originally posted on InfoQ. Visit InfoQ

Jon McLoone shows that symbolic representation also helps in automating the transition from research experiments to the production deployment of AI services.

By Jon McLoone



Splunk Launches New Release of SignalFx APM

MMS Founder
MMS Helen Beal

Article originally posted on InfoQ. Visit InfoQ

Splunk, a platform for searching, monitoring, and examining machine-generated big data, has launched a new release of application monitoring tool SignalFx Microservices APM™. The new release combines NoSample™ tracing, open standards-based instrumentation, and artificial intelligence (AI)-driven directed troubleshooting from SignalFx and Omnition into a single solution. SignalFx Microservices APM supports lightweight, open source and open standards-based instrumentation with the goal of flexible data collection designed for modern cloud environments.

Splunk also further expanded its observability offerings with a major feature release in SignalFx Infrastructure Monitoring for containerised data: Kubernetes Navigator. Kubernetes Navigator uses AI-driven analytics to surface recommendations intended to expedite triaging and troubleshooting. Workflow integration between Kubernetes Navigator and Splunk Enterprise or Splunk Cloud aims to reduce context switching and provide insights with the goal of accelerated root-cause analysis.

InfoQ asked Karthik Rau, area general manager for application management at Splunk, to answer some questions relating to the new release:

InfoQ: How do teams use Splunk solutions, including SignalFx, to obtain a complete picture in hybrid environments that include legacy, heritage or cherished applications and platforms, such as SAP or mainframe alongside public/private cloud and microservices-based products?

Rau: When we look at the market, we can easily identify a trend to move workloads from private, to hybrid, to public clouds; a journey to become cloud-native. We realised that there was a gap in observability for microservices-based applications, for which traditional methods of application monitoring don’t work because applications are no longer constructed as monoliths. Because of the ephemeral nature of cloud infrastructure, the complex interdependencies of hundreds, sometimes thousands, of microservices, and DevOps teams releasing code multiple times per day, problems occur much more frequently and are much harder to troubleshoot and resolve. This new complexity frequently results in customer-impacting service outages, slowdowns and errors. To solve these problems, we took a different approach with SignalFx Microservices APM: collecting and analysing 100% of their data. This is beneficial for IT and DevOps since it means that no issue goes undetected. Once the data is collected, SignalFx uses a combination of AI and ML to connect the dots and drive relevant information to the surface, with the goal of allowing developers to spend less time searching for the source of problems and more time resolving them.

InfoQ: How can Splunk help teams gain visibility into the 4 key DevOps metrics i.e. deployment frequency, lead time (code commit to deploy in production), MTTR and change fail rate?

Rau: There are two important aspects to this: firstly, the application delivery pipeline, and secondly, as part of that lifecycle, production monitoring. Splunk Enterprise and Splunk Cloud already provide application lifecycle analytics, which provides visibility into the end-to-end development process, connecting tools across the entire development toolchain and providing visibility into code quality and DevOps metrics. With the addition of SignalFx Microservices APM, we now provide DevOps teams with the industry’s most powerful production monitoring and troubleshooting solution for any on-premise, hybrid, or cloud application. One of the unique capabilities of SignalFx Microservices APM is the ability to collect 100% of traces, meaning that DevOps teams can, with full fidelity and extremely high levels of granularity, understand the exact behaviour of their software and accelerate deployment frequency. Combined with our streaming analytics engine, our customers can see the impact of such releases in seconds, thereby minimising Mean Time to Detect (MTTD), and act immediately. With our unique AI-Driven Directed Troubleshooting, which combs through all the trace data and automatically surfaces recommendations, DevOps teams can quickly pinpoint and resolve the root cause of an issue, significantly reducing MTTR and helping developers. Finally, we also have the ability to automate responses via our monitoring-as-code approach. We can enable DevOps teams to deploy multiple versions of code or canary releases, track the impact of each and every release, and do a roll-back if there’s a problem, with the intention of reducing change failure rate and fixing problems before they impact end users.

InfoQ: How does Splunk help teams manage flow in a value stream?

Rau: Splunk helps manage the end-to-end application (DevOps) lifecycle by monitoring the delivery pipeline and production environment, as mentioned above. In addition, with our incident response and automation capabilities, we provide open- and closed-loop capabilities for the supporting practices, especially for incident management, service level management and knowledge management. Production application monitoring is often the weakest link in the value stream, with legacy APM solutions providing limited visibility into what is actually happening with applications and end user experiences with those applications. With SignalFx Microservices APM, DevOps teams are able to correlate, understand, and quickly act on mountains of trace data to deeply understand the behaviour of their applications, instantly detect problems, and quickly resolve issues before users are affected. The level of observability that Splunk now offers means that developers spend less time troubleshooting and more time coding.

InfoQ: Can Splunk help teams calculate the value realised from a new feature and if so, how?

Rau: With SignalFx, we support custom business metrics that tie directly back to the production application so DevOps teams and business stakeholders can see how code changes can positively (or negatively) impact application uptime and user experience, and correlate that to, for example in an e-commerce application, units of goods sold, in real time. This ability to track relevant business data, correlate it to application performance, and do so in real time is increasingly important for any digital initiative, especially those built around always-on online experiences. Splunk recently released a survey in conjunction with ESG that quantified the economic impact of leveraging data. The survey found that on average, companies reported a bottom-line improvement over the past 12 months of $27.6M (or a 9.1% gain in net income) directly attributable to operationalising data.

InfoQ: What are the observability challenges that microservices architectures cause and how does Splunk solve them?

Rau: Microservices have a lot of advantages in terms of scaling, time to market, among others, but they also introduce their own challenges and high degrees of complexity – the infrastructure on which they run is typically ephemeral, spinning up and spinning down very quickly, services and individual instances of services scale fast and, as their numbers multiply, the interactions between them multiply even faster, causing the amount of data to skyrocket and creating very complex interdependencies. You often have multiple versions of the same microservice running at the same time, and these versions are released sometimes several times a day. Finally, DevOps teams try to find the optimal tools and frameworks for each microservice, and as a result rely heavily on open source and open standards. In such environments, traditional APM tools miss issues because their approach to handling large amounts of data is based on sampling and manual, needle-in-a-haystack troubleshooting; they’re slow, siloed, and lock customers in with proprietary agents. On the other hand, SignalFx Microservices APM was designed specifically for microservices. We solve the challenges they introduce by ingesting and analyzing all the data, using advanced AI and streaming analytics to get insights within seconds, as well as leveraging and contributing to open standards such as OpenTelemetry, which we co-founded.

InfoQ: What are some examples of insights that Kubernetes Navigator provides?

Rau: Kubernetes Navigator provides visibility into Kubernetes environments of all sizes. With Kubernetes Navigator, DevOps teams are able to detect, triage and resolve performance issues by navigating the complexity associated with operating Kubernetes at scale. Kubernetes Navigator helps DevOps teams expedite troubleshooting and provides them with ways to instantly understand the health of Kubernetes clusters. To understand the ‘why’ behind performance anomalies, Kubernetes Navigator uses AI-driven analytics, which automatically surface insights and recommendations to answer what is causing anomalies across the entire Kubernetes cluster; nodes, pods, containers, and workloads. One such example is a noisy neighbour problem. Application workloads run on containers that are dynamically managed by Kubernetes across shared infrastructure resources. A noisy neighbour, which could be caused by a simple misconfiguration on a memory limit, could increase the memory consumption on a particular node, impacting the rest of the containers, and application workloads, on that node. This might result in end users experiencing slow performance or errors as they interact with the application. Without Kubernetes Navigator, DevOps teams would spend significant time examining individual nodes, pods or workloads. Kubernetes Navigator makes suggestions on what specific pod or workload might be causing the anomalies, with the goal of reducing triaging and troubleshooting time. A unified, correlated view across services and infrastructure can enable DevOps teams to swiftly identify what specific instance of a service is being impacted.

InfoQ: Where is the line between infrastructure and application in a product-centric, cloud and microservices world?

Rau: In order to survive and thrive in today’s increasingly product-centric world, an equal focus must be put on infrastructure and application. End user interactions are at the core of every business today, and their experiences are fragile. End users that have to wait too long for an application to load do not care whether the root cause is in the infrastructure or in the application. That’s why having a unified, full stack view of both your applications and your infrastructure, and being able to correlate the two is extremely important, and can have a direct impact on revenue, and ultimately, overall brand loyalty. Another consideration is the evolution of cloud infrastructure in the sense that it is becoming much more software-defined and ephemeral. Developers no longer need to rely on IT teams to rack and stack servers in a data centre. They can simply go to any cloud provider and, with a few simple clicks of the mouse button, provision any amount of infrastructure resources they need in a matter of minutes. They can also use serverless functions, which abstract away infrastructure altogether. This evolution of infrastructure has been critical to accelerating innovation and the delivery of software.

InfoQ: How does Splunk integrate with ChatOps and service desk or incident management solutions such as ServiceNow, Jira Service Desk or Cherwell?

Rau: Splunk’s VictorOps incident response system integrates deeply with service desks like ServiceNow as well as chat-oriented tools like Slack. Incident tickets in ServiceNow are correlated with incidents in VictorOps, and all updates and closures of tickets are synchronised between ServiceNow and VictorOps. Similarly, VictorOps integrates with Slack. When an incident is opened, a Slack channel is opened, and chat that occurs in that channel is synchronised between Slack and VictorOps. You can even use Slack commands to escalate, snooze and close events. Combined, VictorOps can synchronise across ServiceNow and Slack so operations teams and developers can chat in their preferred tool, but VictorOps is logging everything. Teams can also curate interactions between people for post incident review reporting.

InfoQ: What does being a gold member of Cloud Native Computing Foundation (CNCF) mean?

Rau: We became a gold member to demonstrate our commitment to open source and deepen our relation with the DevOps community. While Splunk has been actively involved in open source for many years with offerings and contributions to numerous projects, this commitment has accelerated with the acquisitions of SignalFx, Omnition – a founding contributor to the OpenTelemetry project, and others. Our own CNCF contributions have included projects like Cortex and Prometheus, Envoy, Fluentd and others, both as maintainers and contributors. More recently, our team is focused on bringing the OpenTelemetry project to fruition to provide developers with the most flexibility in collecting data from their applications while avoiding proprietary, heavy-weight and performance-impacting agents.

To learn more about the CNCF’s projects, review the CNCF Cloud Native Interactive Landscape.



CircleCI Releases API Version 2 with Improved Insights Endpoints

MMS Founder
MMS Matt Campbell

Article originally posted on InfoQ. Visit InfoQ

CircleCI has improved the stability of their insights endpoints in the version 2 release of their API. The insights endpoints allow teams to track the status of jobs and workflows, monitor the duration of jobs, and investigate opportunities for optimizing resource consumption.

With this announcement, four new endpoints have been added to the API. It is now possible to view aggregate data at the job or workflow level, including the number of successful and failed runs, the throughput (average number of runs per day), the mean time to recover, and the total credits used. Duration data is returned with a number of statistics, including max, min, mean, median, p95, and standard deviation. To retrieve aggregated data for a particular workflow, the following endpoint can be used:

GET https://circleci.com/api/v2/insights/{project-slug}/workflows

It is also possible to retrieve data for the most recent runs going back at most 90 days. For example, to retrieve job execution data for a particular workflow the following API can be called:

GET https://circleci.com/api/v2/insights/{project-slug}/workflows/{workflow-name}/jobs/{job-name}

The response includes when the job started, stopped, and its status. It also provides the number of credits used in running the job.

The new API uses an identifier for projects that corresponds to their upstream repository, known as the project_slug. The project_slug is of the form <project_type>/<org_name>/<repo_name>. For example, for the GitHub repository https://github.com/CircleCI-Public/circleci-cli, the project_slug would be gh/CircleCI-Public/circleci-cli. For project_type it is possible to use github or gh for GitHub, and bitbucket or bb for Bitbucket repositories.
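As a minimal sketch, the aggregate workflow data can be retrieved programmatically; the snippet below assumes Node.js 18+ (for the global fetch) and a personal API token in the CIRCLE_TOKEN environment variable, and the response field names follow the API documentation:

const projectSlug = 'gh/CircleCI-Public/circleci-cli';

async function getWorkflowInsights() {
  const res = await fetch(
    `https://circleci.com/api/v2/insights/${projectSlug}/workflows`,
    {headers: {'Circle-Token': process.env.CIRCLE_TOKEN}}
  );
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  const {items} = await res.json();
  for (const workflow of items) {
    // e.g. success rate and 95th-percentile duration per workflow
    console.log(workflow.name, workflow.metrics.success_rate,
                workflow.metrics.duration_metrics.p95);
  }
}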

The release of the version 2 API saw pipelines becoming a first-class citizen in the API. In 2019, CircleCI renamed the concept of “build processing” to “pipelines”. According to Nathan Dintenfass, Product Manager with CircleCI, a pipeline is

a unit of work requested of CircleCI. Each is triggered on a given project, on a given branch (your default branch applies as default value), by a particular actor with a particular configuration (or an implied way to retrieve configuration upstream, as with our historic GitHub integration).

The v2 API now allows for the addition of parameters to pipeline triggers, accomplished by passing the parameters key in the JSON body of the POST request. In version 2, pipeline parameters are resolved during configuration processing, which makes them available to most parts of the configuration. This differs from the v1.1 API, in which parameters were injected directly into the job environment.
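A sketch of such a trigger (Node.js 18+, token in CIRCLE_TOKEN as above), reusing the project slug from earlier and the two parameters defined in the configuration shown below; the branch name is a placeholder:

async function triggerPipeline() {
  const res = await fetch(
    'https://circleci.com/api/v2/project/gh/CircleCI-Public/circleci-cli/pipeline',
    {
      method: 'POST',
      headers: {
        'Circle-Token': process.env.CIRCLE_TOKEN,
        'Content-Type': 'application/json',
      },
      // the `parameters` key is resolved during configuration processing
      body: JSON.stringify({
        branch: 'main',
        parameters: {'image-tag': 'stable201911', workingdir: 'main-app'},
      }),
    }
  );
  return res.json(); // contains the new pipeline's id and number
}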

Pipeline parameters are declared using the parameters object in the config.yml file. They are then referenced using pipeline.parameters. For example, the config below defines two pipeline parameters: image-tag and workingdir:

version: 2.1
parameters:
  image-tag:
    type: enum
    default: "latest"
    enum: ["latest","bleeding","stable201911"]
  workingdir:
    type: string
    default: "main-app"

jobs:
  build:
    docker:
      - image: circleci/node:<< pipeline.parameters.image-tag >>
    environment:
      IMAGETAG: << pipeline.parameters.image-tag >>
    working_directory: /Code/<< pipeline.parameters.workingdir >>
    steps:
      - run: echo "Image tag used was ${IMAGETAG}"
      - run: echo "$(pwd) == << pipeline.parameters.workingdir >>"

There are also additional clauses when and unless to help decide whether or not to run a particular workflow. According to Dintenfass, “The most common use of this construct is to use a pipeline parameter as the value, allowing an API trigger to pass that parameter to determine which workflows to run.”

The CircleCI API v2 has full support for OpenAPI 3. There are live specs for the production API available for both JSON and YAML. The preview release documentation for future versions of the API is also available for review.



Google Cloud Healthcare API Now Generally Available

MMS Founder
MMS Steef-Jan Wiggers

Article originally posted on InfoQ. Visit InfoQ

In a recent blog post, Google announced the general availability of its Cloud Healthcare API. This service facilitates the exchange of healthcare data between solutions built on Google Cloud Platform (GCP) and applications. 

The public cloud provider first launched the Cloud Healthcare API over two years ago to provide customers with a robust, scalable infrastructure solution to ingest and manage key healthcare data types, including HL7, FHIR and DICOM. Moreover, customers can use this data for analytics and machine learning in the cloud. Last year the service was released in beta, and now it is generally available.

Google Healthcare and Life Sciences directors Joe Corkery and Aashima Gupta wrote in the announcement blog post:

With the Cloud Healthcare API and our partner ecosystem, our goal is to make it as simple as possible for the industry to make informed, data-driven decisions, so that caregivers can focus on what matters most: saving lives.

And with the current COVID-19 pandemic:

As the industry is pushed to its limits in light of COVID-19, the need for increased data interoperability is more important than ever. In the last few months, the industry has laid the foundation for progress with final rules issued by CMS and ONC, implementing key provisions of the 21st Century Cures Act. Today, healthcare organizations are in dire need of easy-to-use technology that supports health information exchange.  

Since the Cloud Healthcare API deals with privacy-sensitive data and needs to be secure, Google has integrated it with several data loss prevention, policy, and identity management tools. Users of the API can select the region in which to store their data, implement practices such as the principle of least privilege, and track all interactions with their data leveraging Google’s Cloud Audit Logs. Furthermore, Google ships the Cloud Healthcare API with connectors to several of its streaming and data processing tools, such as BigQuery, AI Platform, Dataflow, and Looker. Also, the API features an automated de-identification capability that obfuscates or removes any personally identifiable information, allowing the data to be used for training and evaluating machine learning models.


Source: https://cloud.google.com/blog/topics/healthcare-life-sciences/getting-to-know-the-google-cloud-healthcare-api-part-1
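As an illustrative sketch (not an official client library), reading data out of a FHIR store boils down to REST calls against the API’s resource hierarchy; the project, location, dataset, and store names below are placeholders, and the OAuth access token is assumed to be obtained separately:

const BASE = 'https://healthcare.googleapis.com/v1';
const STORE = 'projects/my-project/locations/us-central1' +
              '/datasets/my-dataset/fhirStores/my-store';

async function listPatientIds(accessToken) {
  // Searches the Patient resources in the FHIR store
  const res = await fetch(`${BASE}/${STORE}/fhir/Patient`, {
    headers: {Authorization: `Bearer ${accessToken}`},
  });
  const bundle = await res.json(); // a FHIR searchset Bundle
  return (bundle.entry || []).map((entry) => entry.resource.id);
}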

Currently, Google is not the only public cloud provider offering services for healthcare in the cloud. Last November, Microsoft released the Azure API for Fast Healthcare Interoperability Resources (FHIR) in GA as part of a series of APIs to help customers with machine learning on protected health information in the cloud. Furthermore, on AWS, Amazon offers its Comprehend Medical service, which leverages machine learning to extract relevant medical information from unstructured text.

A respondent on a Reddit thread on the GA release wrote:

Many upsides to democratizing patient data between the providers and the consumers. This google’s API release for GA will most definitely open up new opportunities in the healthcare space. It’s a dream come true for several startups clamouring for this data, to provide a preventative cure.

More details of the Google Cloud Healthcare API are available on the documentation landing page, and pricing details are available on the pricing page.



Article: Reinforcement Machine Learning for Effective Clinical Trials

MMS Founder
MMS Dattaraj Jagdish Rao

Article originally posted on InfoQ. Visit InfoQ

In this article, author Dattaraj Jagdish Rao explores the reinforcement learning technique called multi-armed bandits and discusses how it can be applied to areas like website design and clinical trials.

By Dattaraj Jagdish Rao
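To give a flavor of the technique, below is a minimal, generic epsilon-greedy bandit sketch (an illustration of the general idea, not necessarily the author’s exact method): each “arm” could be a website variant or a trial treatment, and the agent mostly exploits the best-looking arm while exploring the others with probability epsilon.

function makeBandit(numArms, epsilon = 0.1) {
  const counts = new Array(numArms).fill(0); // pulls per arm
  const values = new Array(numArms).fill(0); // running mean reward per arm

  return {
    selectArm() {
      if (Math.random() < epsilon) {
        return Math.floor(Math.random() * numArms); // explore
      }
      return values.indexOf(Math.max(...values)); // exploit
    },
    update(arm, reward) {
      counts[arm] += 1;
      values[arm] += (reward - values[arm]) / counts[arm]; // incremental mean
    },
  };
}

A caller would loop: pick an arm with selectArm(), observe a reward (e.g. a click, or a positive patient outcome), and feed it back with update().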



Architecture Decision Records At Spotify

MMS Founder
MMS Sergio De Simone

Article originally posted on InfoQ. Visit InfoQ

Several teams at Spotify use architecture decision records (ADR) to capture decisions they make. ADRs have brought a number of benefits to Spotify, including improved onboarding for new developers, improved agility when handing over project ownership due to organization changes, and improved alignment across teams regarding best practices.

An Architectural Decision (AD) is a software design choice that addresses a functional or non-functional requirement that is architecturally significant.

Architecture decision records are a technique often used in agile contexts, where architecture constantly evolves. As agile expert Michael Nygard wrote,

Architecture for agile projects has to be described and defined differently. Not all decisions will be made at once, nor will all of them be done when the project begins.

Architecture decision records include information to understand the context that led to a given decision as well as its consequences. Additionally, they can also document decisions that were not made and the reasons why.

Deciding when an ADR should be written is not always easy, since there are multiple ways of understanding when a decision has a significant impact on a project, says Spotify engineer Josef Blake. In his experience, there are at least three scenarios where writing an ADR should be a no-brainer.

First of all, you will want to write an ADR to capture a past decision that was never documented, making sure its existence is clear to everyone. Indeed, if a decision was made but never recorded, can it really be considered a standard?

One way to identify an undocumented decision is during peer review: the introduction of a competing code pattern or library may lead the reviewer to discover a decision that was never written down.

Often, writing an ADR is the final step in the process of making a change that has a large impact on a system, for example a change that would break an API. In such cases, Spotify engineers write a request for comments (RFC) as a means of getting all stakeholders to agree on a common approach. Once the RFC process is completed, the agreed-upon solution is captured in an ADR.

ADRs should not be written only for decisions with a large impact, though, remarks Blake.

The cost of undocumented decisions is hard to measure, but the effects usually include duplicated efforts (other engineers try to solve the same problems) or competing solutions (two third-party libraries that do the same thing).

In such cases, writing an ADR has the added benefit of not being particularly complex.

Architecture decision records are by no means a novel technique. In particular, lightweight decision records appeared on the ThoughtWorks technology radar for a couple of years. If you are interested in giving them a try, you can find additional information as well as ready-to-use templates in this repository.



Presentation: Building a DevSecOps Pipeline Around Your Spring Boot Application

MMS Founder
MMS Hayley Denbraver

Article originally posted on InfoQ. Visit InfoQ

Hayley Denbraver looks into the tools, methodology, culture, and process changes to consider so that an organization is ready for the transformation needed for a DevSecOps pipeline.

By Hayley Denbraver



Presentation: Day 3: Security Auditing and Compliance

MMS Founder
MMS David Zendzian Steve White

Article originally posted on InfoQ. Visit InfoQ

David Zendzian and Steve White discuss how to handle ongoing security requirements when running on Cloud Foundry platforms.

By David Zendzian, Steve White
