Mobile Monitoring Solutions


Poor Random Number Generation Makes 1 in Every 172 RSA Certificates Vulnerable

By Sergio De Simone

Article originally posted on InfoQ.

A research report by security firm KeyFactor shows that many IoT and network devices are using weak digital certificates that make them vulnerable to attack. Researchers Jonathan Kilgallin and Ross Vasko analyzed 75 million RSA certificates and found that 1 in 172 keys shares a factor with another, which means those keys can be easily cracked.

Indeed, if two RSA moduli share a prime factor, that factor can be found by computing their greatest common divisor (GCD). Dividing each modulus by the shared factor then yields its remaining prime, from which the private key associated with each certificate can be derived. Since calculating the GCD is computationally cheap, this approach can be scaled to the large number of public keys that are publicly available on the Internet. In other words, mining public keys and calculating the GCD of pairs of them opens the way to easily breaking any pair that shares a common factor.
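
As a minimal sketch of the idea (with toy numbers, and not KeyFactor's actual tooling), the following Java snippet recovers the factors of two RSA moduli that share a prime, using nothing more than Euclid's algorithm:

import java.math.BigInteger;

public class SharedFactorDemo {
    public static void main(String[] args) {
        // Toy primes for illustration; real RSA primes are 1024+ bits
        BigInteger p  = BigInteger.valueOf(61);  // prime shared by both keys
        BigInteger q1 = BigInteger.valueOf(53);
        BigInteger q2 = BigInteger.valueOf(59);

        BigInteger n1 = p.multiply(q1);          // first public modulus
        BigInteger n2 = p.multiply(q2);          // second public modulus

        // Euclid's algorithm: cheap even for 2048-bit moduli
        BigInteger g = n1.gcd(n2);
        if (!g.equals(BigInteger.ONE)) {
            // Both moduli are now fully factored, so both private keys fall
            System.out.println("shared factor: " + g);
            System.out.println("n1 = " + g + " * " + n1.divide(g));
            System.out.println("n2 = " + g + " * " + n2.divide(g));
        }
    }
}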

This result, and the possibility of leveraging it to attack and break RSA keys, was already known, although it was not considered a major concern in practice:

Despite the large number of keys broken by this attack previously, it is still unlikely that a key that has been properly generated with a sufficient amount of entropy could be broken with this technique.

Kilgallin and Vasko have now shown that the assumption that keys are generated with enough entropy does not hold in a worryingly high number of cases. This seems to happen especially with IoT appliances, which may not have enough entropy available to generate keys with a high level of randomness.

With modest resources, we were able to obtain hundreds of millions of RSA keys used to protect real-world traffic on the Internet. Using a single cloud-hosted virtual machine and a well-studied algorithm, over 1 in 200 certificates using these keys can be compromised in a matter of days.

Besides the lack of sufficient entropy, there are a number of additional reasons why IoT devices are specifically prone to this kind of attack. One is that the probability of success grows with the number of certificate pairs available for analysis, which has drastically increased with the growth of the IoT market. Additionally, IoT devices are harder to patch, which makes it more likely to find vulnerabilities in devices that are no longer actively supported.

Although KeyFactor researchers focused on RSA, they say their results could extend to other algorithms that rely on random number generation, such as Elliptic-Curve Cryptography (ECC).

This is not the first time concern about IoT security has been raised; the latest disclosed vulnerability affecting a vast class of IoT devices is only a week old. Kilgallin and Vasko’s work again brings into focus the importance of applying security best practices from the inception of a project and of including support for timely updates to both software and cryptography in IoT devices.



Java 14 Is in Feature-Freeze and Release Rampdown

By Ben Evans

Article originally posted on InfoQ.

The release process for Java 14 has begun. JDK 14 is now in Rampdown Phase One, which means that the overall feature set is frozen and no further features will be targeted to this release.

As is usual for Java releases, a list of JEPs (JDK Enhancement Proposals) forms the content of the new version. The set of finalized features is as follows:

Two JEPs deal with the arrival of the ZGC garbage collector on additional platforms: JEP 364 (ZGC on macOS) and JEP 365 (ZGC on Windows).

Next come several JEPs that relate to Preview or Incubating features, among them Records (JEP 359), Pattern Matching for instanceof (JEP 305), and a second preview of Text Blocks (JEP 368).

This group of JEPs is perhaps more interesting than it initially appears – it actually contains two important building blocks of a major set of new features for Java.

The records feature essentially brings named tuples to Java, and is the first half of what other languages call algebraic data types. The other half is the sealed types feature, which is JEP 360 and is not yet targeted at any release.
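
For illustration, here is a minimal, hypothetical record (compiled on JDK 14 with --enable-preview, since records are a preview feature there). The components are declared once, and the compiler generates the constructor, accessors, equals, hashCode, and toString:

record Point(int x, int y) { }

class RecordDemo {
    public static void main(String[] args) {
        Point p = new Point(3, 4);
        // Generated accessors are named after the components
        System.out.println(p.x() + ", " + p.y()); // prints: 3, 4
        // Generated toString shows the components
        System.out.println(p);                    // prints: Point[x=3, y=4]
    }
}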

The other building block of a future feature is JEP 305 (“Pattern Matching for instanceof”). This feature seems at first sight to be very small, and for now just reduces the boilerplate of unsightly casts when using the `instanceof` operator:

if (obj instanceof String s) {
    // can use s here
} else {
    // can't use s here
}

Although it seems almost trivial, the real power of this feature will only arrive in a future version of Java. The switch expression feature (which is also being standardized as part of Java 14) will be used to build on JEP 305 to produce general pattern matching – which is a major new feature, especially when combined with algebraic data types.
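
For reference, the switch expression feature (JEP 361) lets a switch yield a value directly, with arrow labels and no fall-through. A minimal sketch (the method and values here are hypothetical):

static String describe(int errorCode) {
    return switch (errorCode) {
        case 0       -> "ok";
        case 1, 2, 3 -> "client error";
        default      -> "unknown (" + errorCode + ")";
    };
}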

Finally, there is a group of JEPs that cannot strictly be said to be features, as they deal only with the deprecation or removal of capabilities – among them the deprecation of the Solaris/SPARC ports (JEP 362) and the removal of the CMS garbage collector (JEP 363).

This means that JDK 14 is something of a sad milestone – it marks the end of the road for Java on Solaris, which was the platform where it made its first appearance, back in 1995 as part of the first public release of Java technology.

The removal of the CMS collector is also notable. For almost all modern workloads, G1 performs as well as or better than CMS (after a long period of maturation and stabilization). However, there remains a small class of low-latency, pause-sensitive applications that can neither tolerate G1’s pause thresholds nor pay the performance overhead of collectors like Shenandoah or ZGC. No solution is offered for these workloads – in practice they must remain on Java 11 to be supported in the short-to-medium term.

Overall, Java 14 represents a significant step forward for the platform, although its major features are only released in Preview state. It is also true that, to date, the Java market has not seen significant uptake of non-LTS releases, so it remains to be seen whether Java 14 will move the adoption needle very much.

With the feature freeze and rampdown of Java 14 underway, the mainline of the Java development repositories is now looking towards Java 15 (which should arrive in September 2020).



Experience Running Spotify’s Event Delivery System in the Cloud

By Jan Stenberg

Article originally posted on InfoQ.

Event delivery is a key component at Spotify: events contain data about users and the actions they take, as well as operational logs from hundreds of systems, all of which are crucial for successfully running the business. The current event delivery system runs on the Google Cloud Platform and at the end of Q1 2019 was handling more than 8 million events per second at peak globally, corresponding to over 350 TB of raw event data flowing through the system daily. After running the event delivery system in the cloud for two and a half years, Bartosz Janota and Robert Stephenson have written a blog post discussing what they have achieved and how they have been able to evolve and simplify the system by moving up the stack in the cloud.

When Spotify decided in 2015 to move its infrastructure to the Google platform, it was clear that they also had to redesign their event delivery system and adapt it to the cloud. It took the team almost a year to design, write, deploy, and scale the current Cloud Pub/Sub-based event delivery system fully into production. To succeed with this, they kept the producing and consuming interfaces compatible with the old system, which also gave them the ability to run both systems in parallel. They had originally planned a staged rollout, but in the end they rolled out the new system for all traffic in one day, and that worked fine. The old Kafka-based system was turned off in February 2017.

Janota and Stephenson point out a few principles, strategies, and decisions they believe were key to building a system capable of handling this volume of events. To prevent high-volume events from disrupting business-critical data, they separate events by type at the entry point and isolate the corresponding event streams as soon as possible. By giving each type its own Pub/Sub topic, ETL process (extract, transform, load), and final storage location, they can deliver each type individually. They are also able to prioritize work and resources during incidents so that the most important event types are dealt with first. Separating by event type also allows them to prioritize liveness over lateness: when one event type experiences problems or is blocked, the other event types can still be consumed.
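
As a minimal illustration of this per-type isolation (a hypothetical sketch, not Spotify's actual code), an entry point might lazily create one publisher per event-type topic and route each event accordingly:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: one topic and one publisher per event type,
// so a blocked or noisy type cannot delay delivery of the others.
class EventRouter {
    interface Publisher { void publish(byte[] payload); }

    private final Map<String, Publisher> publishers = new ConcurrentHashMap<>();
    private final Function<String, Publisher> publisherFactory;

    EventRouter(Function<String, Publisher> publisherFactory) {
        this.publisherFactory = publisherFactory;
    }

    void route(String eventType, byte[] payload) {
        // Lazily create one publisher per event-type topic
        publishers.computeIfAbsent("events." + eventType, publisherFactory)
                  .publish(payload);
    }
}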

The system is made up of almost 15 different microservices deployed on 2,500 VMs. This allows the team to work on each service individually and replace any of them if needed. Some of these services are autoscaled, and with all these instances their biggest challenge is that the state changes all the time. Deployments of the whole fleet may take up to three hours, which means that although the system is designed for rapid iterations, they still have quite a long iteration cycle; this is one of their pain points.

The system was designed before GDPR, which meant the team spent a lot of time making it compliant when the regulation became effective. GDPR is now a primary concern whenever they design a system that handles personal data.

One lesson they learned during the work was that data grows an order of magnitude faster than service traffic. More active consumers drive organizational growth, with an increasing number of engineers and teams introducing and instrumenting new features. This creates more data, and a need for even more data engineers and scientists looking into that data to gain further insights. More insights then result in more features, and the growth compounds.

Other strategies they have adopted include using managed services for problems that are not core for the business, and when testing new ideas, being prepared to fail fast and recover even faster.

In the next blog post in this series, Janota and Stephenson intend to describe their work on the next generation of event delivery.

In a presentation at the Big Data Conference in Vilnius 2019, Nelson Arapé discussed the evolution of Spotify’s event delivery system and the lessons learned along the way.

In a presentation at QCon London 2017, Igor Maravic gave a high level overview of the event delivery system and some of the key operational aspects. In a three-part series of blog posts in 2016, he described how they moved the system to the cloud.



Ionic React Released

By Dylan Schiemann

Article originally posted on InfoQ.

The Ionic team recently announced the first production release of Ionic React, a version of Ionic that leverages React to build applications for iOS, Android, Desktop, and Progressive Web Apps (PWA).

First announced as part of the Ionic 4 release, Ionic React leverages the react-dom library. Rather than being an alternative to React Native, Ionic React wraps web APIs instead of native controls and APIs.

To support its capabilities, Ionic React leverages two other open-source Ionic dependencies, Capacitor and Stencil. Stencil supports the generation of highly efficient components and is part of Ionic React’s PWA solution. In contrast, Capacitor is a modern replacement for Cordova or PhoneGap, leveraging modern JavaScript and web features with deployment across iOS, Android, Electron, and the web.

To use Ionic React, developers first install the Ionic CLI:

npm i -g ionic

Then, a new React project can be created:

ionic start my-react-app

The Ionic CLI provides a series of interactive questions to answer to build an application, generates a starter template, and provides a default HTTP server leveraging Create React App to compile, start, and open a project.

Ionic React applications leverage TypeScript by default, though it is simple to switch to vanilla JavaScript if preferred. Ionic React works with functional components by default and provides theming, routing, React hooks, and many other React features, gathered together in a supported environment.

As explained by Ionic CEO Max Lynch, Ionic React provides optional support beyond typical open-source software:

Ionic is bringing something different to the React and cross-platform ecosystem: a fully-supported, enterprise-ready offering with services, advisory, tooling, and supported native functionality available to teams that need more than they can get through community-supported options.

Ionic React is part of the Ionic Framework and is open source software available under the MIT license. Contributions and feedback are encouraged via the Ionic GitHub project and should follow the Ionic contribution guidelines and code of conduct.



JetBrains Releases AWS Toolkit for Rider

By Arthur Casals

Article originally posted on InfoQ.

Earlier this month, JetBrains released the Rider version of its AWS Toolkit, an IDE plugin aimed at helping developers build, test, and deploy serverless applications on the Amazon Web Services platform. This release also includes support for Node.js (in WebStorm) and updates to the first version, available since March of this year for Java and Python developers using IntelliJ IDEA and PyCharm, respectively.

The new release adds support for C# developers using Rider and features multiple helper functionalities such as a .NET Core project template for AWS Lambda applications, automatically created run configurations for Docker environments, and AWS credential and regions management tools.

An exciting feature in the new release is the Cloud Debugging functionality, which allows a developer to debug a .NET Core application in a Linux container while it is deployed in ECS. This functionality is currently in beta on AWS and is the result of a joint effort between Amazon and JetBrains.

The plugin also contains a visual tool called AWS Explorer, similar to the one present in the AWS Toolkit for Visual Studio. It allows the deployment and management of AWS Lambda functions and related AWS resources (such as creating S3 buckets to hold deployment artifacts). Support for AWS databases (such as RDS, Aurora, and Redshift) is provided by JetBrains’ multi-database tool (DataGrip) and integrated into Rider.

Other features in the new release include the possibility of deploying AWS Lambda functions directly from the template.yaml file, and gutter icons to run and debug Lambda handlers directly from either the C# file or the CloudFormation configuration file.

Using the AWS Toolkit with Rider requires installing Amazon’s AWS CLI and AWS SAM CLI tools (you also need an AWS account configured with the permissions required by the CLI tools). Docker is also required, since it is used to run Lambda functions locally. The plugin also requires .NET Core 2.0 or 2.1 to be installed (even if your application targets .NET Core 3.0).

JetBrains’ AWS Toolkit can be downloaded for IntelliJ IDEA, PyCharm, Rider, and WebStorm. A complete tutorial on using the plugin with Rider is available, as is detailed documentation of the Cloud Debugging feature. The plugin is open-sourced on GitHub.



V8 JavaScript Engine 8.0 Reduces Heap by 40%, Adds Optional Chaining and Null Coalescing

By Sergio De Simone

Article originally posted on InfoQ.

The latest release of Google’s V8 JavaScript engine, V8 8.0, uses pointer compression to reduce heap size by 40% with no performance hit. Additionally, it adds support for optional chaining using the ?. operator and for nullish coalescing using ??. V8 v8.0 will be officially available with Chrome version 80.

V8 v8.0 applies compression to JavaScript tagged values, which are used to represent pointers into the heap or small integers. Instead of using the full 64 bits required to represent a pointer on a 64-bit CPU, V8 stores only the lower bits and synthesizes the higher bits. The V8 team has not released detailed information about their approach to pointer compression, but this technique is already used by other platforms, including the JVM with its compressed oops. The general idea is that you conceptually organize your memory into words instead of bytes: if you use 8-byte words, you only need to represent addresses starting at locations 0, 8, 16, 24, and so on. The lowest bits of any meaningful address are then always zero, so you can safely spare them, thus reducing your pointer size.
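
As a minimal sketch of the alignment-based variant just described (the JVM's compressed-oops style; V8's exact scheme may differ), a 64-bit address whose low bits are known to be zero can be stored in a 32-bit field:

class CompressedPointer {
    static final int SHIFT = 3; // log2 of the 8-byte alignment

    // Pack a 64-bit, 8-byte-aligned address into 32 bits
    static int compress(long address) {
        assert (address & ((1L << SHIFT) - 1)) == 0 : "must be 8-byte aligned";
        return (int) (address >>> SHIFT);
    }

    // Recover the full address: a single shift, hence the low cost
    static long decompress(int compressed) {
        // toUnsignedLong avoids sign-extending the 32-bit field
        return Integer.toUnsignedLong(compressed) << SHIFT;
    }
}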

Significantly, says the V8 team, pointer compression does not exact a performance cost. This is possibly related to the fact that going from a compressed pointer to a full pointer, as in the JVM case described above, only entails bit shifting, which is a rather fast operation. In V8’s case there is an additional bonus: the garbage collector becomes faster, too. According to preliminary benchmarks, this makes V8 v8.0 actually faster on real web sites such as Facebook, CNN, and Google Maps, both on mobile and desktop devices.

On the JavaScript side, V8 v8.0 introduces support for two useful syntax additions: optional chaining and nullish coalescing.

Optional chaining aims to make it easier to access properties in sequence without incurring the risk of an exception because some intermediate object is null or undefined. For example, to prevent such an error from occurring, in the following code we check in advance that all the intermediate properties we are going to access are well defined:

if (resource && resource.address && resource.address.types)
  return resource.address.types.length

This can be replaced through the following code, where we use the optional chaining operator ?. to make sure the overall expression is short-circuited to undefined as soon as an intermediate component is null or undefined:

return resource?.address?.types?.length

The nullish coalescing operator ?? is a refinement of || when used in the following context:

let iterations = settings.iterations || 4;

The disadvantage of || in this context is that it cannot be used when the value you want to set, settings.iterations in the example above, may legitimately evaluate to false, e.g. when settings.iterations == 0. In that case you would still end up using the default value 4. The nullish coalescing operator ?? handles such cases correctly, i.e.:

let iterations = settings.iterations ?? 4;

In other words, a ?? b evaluates to b only when a is null or undefined; otherwise it evaluates to a.

V8 v8.0 is not yet the official stable V8 release and will ship in a few weeks in Chrome 80 stable. In the meantime, you can access it using git checkout -b 8.0 -t branch-heads/8.0.



High Availability for Self-Managed Kubernetes Clusters at DT One

By Hrishikesh Barua

Article originally posted on InfoQ.

The engineering team at DT One, a global provider of mobile top-up and reward solutions, wrote about how they implemented IP-failover based high availability for their self-managed Kubernetes cluster ingress on Hetzner’s hosting platform.

DT One runs their Kubernetes clusters on bare metal machines on Hetzner. The cluster has an nginx-based Kubernetes ingress which exposes services to the internet. After trying various approaches to achieve high-availability (HA) for the ingress nodes, they settled on a Puppet-automated IP-failover based solution leveraging Hetzner’s “vSwitch” virtual network.

Kubernetes clusters expose services to external networks like the internet by using a Layer 7 (L7) ingress. Most cloud providers that provide managed Kubernetes also provide an ingress implementation with a load balancer. However, self-managed Kubernetes ingresses usually depend on nginx as a load balancer.

To add high availability to such setups, Kubernetes needs a virtual IP (VIP) plus a keepalived-like solution when multiple IPs are exposed for external traffic. keepalived is a tool that provides HA using the Virtual Router Redundancy Protocol (VRRP) to switch a virtual IP between hosts. For example, there might be multiple ingress nodes configured in round-robin DNS; when a node fails, it has to be removed from DNS manually. With a VIP, the DNS name points to just a single IP (the virtual IP), and keepalived ensures it always points to a live node running the ingress. For cloud platforms like GCP, AWS, and Azure that provide a load balancer, VIPs are unnecessary, as the platform takes care of providing an HA load balancer. However, on platforms where the load balancer is managed by the customer, a VIP can provide HA.

InfoQ got in touch with Jan Hejl, DevOps Tech Lead at DT One, to understand more about the solution.

Usually, the ingress ports are bound to the main host’s IP. Hetzner provides a failover IP feature where an IP address (or even a subnet) can be switched from one server to another within 60 seconds, irrespective of the servers’ locations. The team initially used custom Python scripts, managed by keepalived, to switch Hetzner’s failover IPs between ingress nodes. They later adopted a modified version of an existing solution, but it had drawbacks, such as being forced to use encrypted VRRP and to stick to IPv4. The newer VRRPv3 protocol supports IPv6, but encryption was not possible. Hejl explains the security issues:

A bare-metal machine from Hetzner is part of a /29 or even a /26 subnet, so others can sniff something (say, using tcpdump) that is not part of their own traffic. Especially in the case where the IPs are within the same subnet, spoofing the multicast IP address is not that hard even though you have implemented things like arp_ignore / rp_filter etc.

Since it’s a self-managed L7 ingress, how does DT One protect against attacks like DDoS? Hejl explains that “Hetzner is the first level of defense and then there are our own firewalls.”

DT One uses Puppet for almost everything, says Hejl, with Terraform for automating Hetzner virtual machines and AWS deployments. Puppet was also used to automate the initial solution. This was superseded by a feature Hetzner introduced last year called vSwitch, which allocates a separate Layer 2 (L2) network for customer machines, meaning unencrypted VRRP traffic becomes possible without the security concerns. However, there were still issues with Hetzner’s failover IPs: the time taken for changes to propagate (~30 seconds) was too long, and the mechanism was susceptible to any outages that might occur at Hetzner.

The team finally arrived at a working solution using keepalived and three physical hosts that communicate over a separate vSwitch network, automated using Puppet. Each node acts as the leader for one of three VIPs, with the remaining two nodes as followers for that VIP. keepalived supports email notifications when the status of a node changes. In addition, Hejl says, they use Prometheus, Grafana, and Alertmanager for monitoring and alerting their systems.



James Ward and Ray Tsang on Knative Serverless Platform

By Srini Penchikala

Article originally posted on InfoQ.

At this year’s QCon San Francisco 2019 Conference, speakers James Ward and Ryan Knight hosted a workshop on Serverless technologies using the Knative framework.

Kubernetes has become the popular choice for managing and orchestrating container-based applications. Service mesh technologies like Istio can be used to manage service-to-service communication and monitoring. With the introduction of Knative, a platform built atop Kubernetes and Istio, development teams can now build, deploy, and manage workloads using a serverless architecture.

James Ward and Ray Tsang facilitated the same workshop at the QCon New York 2019 Conference. InfoQ spoke with them to discuss the role of serverless in developing cloud native applications and how Knative helps with these goals.

InfoQ: Can you tell our readers what the Knative framework is and how it fits in with Kubernetes container management platform?

James Ward and Ray Tsang: Knative is a layer on top of Kubernetes which provides the building blocks for a serverless platform, including components for building an app from source, serving an app via HTTP, and routing events with publishers and subscribers. Knative extends Kubernetes features like scale to zero, crash-restarts, and load balancing, and lets you run serverless workloads anywhere you choose: fully-managed on Google Cloud, on Google Kubernetes Engine (GKE), or on your own Kubernetes cluster. Knative makes it easy to start with platforms like Google Cloud Run and later move to Cloud Run on GKE, or start in your own Kubernetes cluster and migrate to Cloud Run in the future. Originally built at Google, Knative is open source and you can find details on the website.

InfoQ: In terms of use cases, when should and shouldn’t we use the serverless based solutions, compared to other architectures like microservices or monolith apps?

Ward and Tsang: Serverless is an operational model that scales up and down based on demand. In cloud environments, this allows billing to be entirely based on actual usage. In self-managed environments this enables the underlying server resources to be returned to a shared pool for use elsewhere. Serverless can fit with both microservice and monolithic architectures, however most monoliths don’t do very well in a serverless world where apps should start quickly and not use global state.

InfoQ: You discussed the CNCF Buildpacks in your workshop. Can you talk about how these Buildpacks help with serverless apps in general, and Knative based apps in particular?

Ward and Tsang: The open source Tekton project (Github repo) is a complement to Knative for transforming source into something that can run on Knative.  Tekton runs on Kubernetes and provides a serverless experience as resources are allocated on demand to run builds and CI/CD pipelines.  CNCF Buildpacks is a standard for detecting a project type and running its build to create a container image.  There are buildpacks available for tools including Maven or Gradle (for Java projects), Python, Ruby, Go, Node, etc.  In the Tekton extension catalog you will find Buildpack support that makes it very easy to go from source to a container image that can run in Knative.

InfoQ: Can you use microservices and Knative functions in the same application or use case?

Ward and Tsang: The deployment unit for all invokables in Knative is a container image which ultimately runs on a Kubernetes pod. Invokables includes apps served via HTTP and events which are sent a Cloud Event via HTTP. So you definitely could have a single container image which can handle both web HTTP requests, REST, and Cloud Events.
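
To make that concrete, here is a rough, hypothetical sketch (not an official Knative sample) of a single container serving both plain web requests and CloudEvents, using the JDK's built-in com.sun.net.httpserver; the port, path, and messages are invented. In the CloudEvents HTTP binary binding, event attributes arrive as ce-* headers:

import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class DualHandler {
    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/", exchange -> {
            // A CloudEvent carries ce-type, ce-source, ce-id, ... headers
            String ceType = exchange.getRequestHeaders().getFirst("ce-type");
            String body = (ceType != null)
                    ? "handled event of type " + ceType
                    : "handled plain web request";
            byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, bytes.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(bytes);
            }
        });
        server.start();
    }
}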

InfoQ: Service mesh technologies are getting a lot of attention lately. Can you talk about how Knative applications can work in a service mesh-based architecture?

Ward and Tsang: Knative is currently built on top of Istio and Kubernetes. Knative applications are automatically running in an Istio service mesh environment. What that means is you automatically get all the benefits from Istio like distributed tracing sampling, and monitoring metrics. When you configure traffic splitting in Knative, it automatically generates the corresponding configuration in Istio to implement the traffic split.

InfoQ: You also talked about Cloud Events in your workshop. Can you discuss how Cloud Events can be leveraged in serverless applications?

Ward and Tsang: Cloud Events are a recent Cloud Native Computing Foundation (CNCF) specification for RESTful event payload packaging. Microservice and Function-as-a-service frameworks are beginning to support Cloud Events to make it easier to parse incoming messages and provide routing metadata. Knative supports Cloud Events in the Eventing component by sending Cloud Events to backing services. If the response from a service handling a Cloud Event is another Cloud Event, it can then be handled by the Eventing system for further routing. Event consumers are services like those that handle HTTP requests in the Serving component of Knative. This means they can scale up and down (even to zero), just like other services.

InfoQ: What are some tools developers can use when working on Knative-based applications?

Ward and Tsang: Knative can run any Docker container image on Kubernetes, but these images can alternatively be run directly in Docker for local development and testing. So you may not need to actually use Knative at development time. Also, since Knative is an extension of Kubernetes, the tools that work in Kubernetes also work with Knative. So you can use tools like Prometheus for monitoring, Fluentd for logging, Kibana for log search, and many others.

InfoQ: Can you recommend any best practices for our readers who want to learn more about serverless and Knative technology, what to consider and what to avoid?

Ward and Tsang: Learn the technology, its use cases, when to use it, and when not to use it. Even though Knative operates at a higher level, it’s layered on top of technologies like Istio and Kubernetes. To run Knative, it’s important to understand the underlying layers, so that when there is a platform issue, you would know where to look and troubleshoot.

For more information on Knative, check out their website and the documentation on how to install and use it.



Article: Top Picks for QCon 2019

By Wesley Reisz

Article originally posted on InfoQ.

In 2019, QCon, the International Software Conference for senior engineers and architects, visited seven cities (San Francisco, London, New York, Sao Paulo, Beijing, Guangzhou, and Shanghai). I had the privilege to chair the three English speaking conferences in San Francisco, London, and New York.

For the English QCons alone, 4,175 attendees engaged with talks from some of the leading minds in software. These are people like John Allspaw (one of the original thought leaders in DevOps), Bryan Cantrill (co-creator of DTrace, previously CTO of Joyent, and now co-founder/CTO of Oxide Computer Company), Sarah Wells (Technical Director for Operations and Reliability at the Financial Times), and Diane Davis (astrodynamicist and Principal Systems Engineer at NASA). Just under 500 speakers gave talks ranging from implementation practices using Kubernetes to practical lessons building machine learning at some of the most innovative shops in the world. These were the technologies shaping how we create software today.

After each conference, we’re often asked which talks were the most interesting. So each morning during the conference, I made a recommendation of what I thought were the best talks of the previous day. These picks are sometimes based on ratings and attendance, but more often simply on the buzz around the conference or on social media. There’s no hard checklist that qualifies a talk for the list; it’s simply the talks I felt you wouldn’t want to miss from that day. The videos for these talks (and all the others) are always immediately available to attendees, and are freely released to everyone else on InfoQ over several weeks.

Below you’ll find the 45 talks (five for each conference day) put together into one list. As I mentioned, you can find all of them available now or in the coming weeks on InfoQ. If you can only watch one talk this holiday season, may I suggest you make it Nick Caldwell’s Ignite the Fire – How Managers Can Spark New Leaders. Nick is the Chief Product Officer @LookerData (recently acquired by Google) and was previously VP of Engineering @Reddit. Don’t let management in the title scare you away. This is a talk for leadership at all levels of a company, and it’s packed with incredibly practical advice that you’ll be able to implement the day you watch the talk. While we don’t recommend keynotes in our daily QCon recommendations (since keynotes are already presented to the entire audience), Nick’s talk, in all honesty, is my pick of the year, and it wouldn’t be fair to skip it here.

Other standouts of the incredible 2019 keynote lineup include Sarah Wells’s Mature Microservices and How to Operate Them at QCon London and Pamela Gay’s closing keynote featuring an inspirational plea for citizen scientists at the recent QCon San Francisco (soon to be published).

If you attended QCon this year, thank you. If you didn’t, I hope to see you next year (we’ll be adding a new English-speaking QCon in Munich). In the meantime, below are my picks for your holiday watch list. Happy Holidays!

QCon San Francisco

Day 1

Parsing JSON Really Quickly: Lessons Learned

by Daniel Lemire (Professor and Department Chair @TELUQ – Université du Québec)

Mistakes and Discoveries While Cultivating Ownership

by Aaron Blohowiak (Engineering Manager @Netflix in Cloud Infrastructure)

Scaling Patterns for Netflix’s Edge

by Justin Ryan (Playback Edge Engineering @Netflix)

Automated Testing for Terraform, Docker, Packer, Kubernetes, and More

by Yevgeniy Brikman (Co-founder @gruntwork_io)

JIT vs AOT Performance With GraalVM

by Alina Yurenko (Developer Advocate for GraalVM @Oracle)

Day 2

Beyond Microservices: Streams, State and Scalability

by Gwen Shapira (Principal Data Architect @Confluent, PMC Member @Kafka, & Committer Apache Sqoop)

Build Your Own WebAssembly Compiler

by Colin Eberhardt (Technology Director @Scott_Logic)

Mapping the Evolution of Socio-Technical Systems

by Cat Swetel (Agile Methods Coach & Advocate for Women in Tech)

Holistic EdTech & Diversity         

by Antoine Patton (Holistic Tech Coach @unlockacademy)

Practical Change Data Streaming Use Cases With Apache Kafka & Debezium

by Gunnar Morling (Open Source Software Engineer @RedHat)

Day 3

Evolution of Edge @ Netflix

by Vasily Vlasov (Engineering Leader @Netflix)

Optimizing Yourself: Neurodiversity in Tech

by Elizabeth Schneider (Consultant @Microsoft)

Security Culture: Why You Need One and How to Create It

by Masha Sedova (Co-Founder @hello_Elevate)

ML in the Browser: Interactive Experiences with Tensorflow.js

by Victor Dibia (Research Engineer in Machine Learning @cloudera)

How to Invest in Technical Infrastructure

by Will Larson (Foundation Engineering @Stripe)

QCon New York

Day 1

Breaking Hierarchy – How Spotify Enables Engineer Decision Making

by Kristian Lindwall (Site Lead Engineering @Spotify)

The State of Serverless Computing

by Chenggang Wu (CS PhD student at RISELab, UC Berkeley)

PID Loops and the Art of Keeping Systems Stable

by Colm MacCárthaigh (Senior Principal Engineer @awscloud)

Let’s talk locks!

by Kavya Joshi (Software Engineer @Samsara)

Machine-Learned Indexes – Research from Google

by Alex Beutel (Senior Research Scientist @Google)

Day 2

Java Futures, 2019 Edition

by Brian Goetz (Java Language Architect @Oracle)

How Much Does It Cost to Attack You?

by Jarrod Overson (Software Engineer @ShapeSecurity)

Robot Social Engineering: Social Engineering Using Physical Robots

by Brittany Postnikoff (Computer Security and Privacy / Human-Robot Interaction Researcher)

Scaling DB Access for Billions of Queries Per Day @PayPal

by Kenneth Kang & Petrica Voicu (Software Engineers @PayPal)

Empathy: A Keystone Habit

by Paul Tevis (Coach & Facilitator at Vigemus)

Day 3

Rust, WebAssembly and Javascript Make Three: an FFI Story

by Ashley Williams (Core Rust & Rust Wasm WG Team Member)

Ignite the Fire – How Managers Can Spark New Leaders

by Nick Caldwell (Chief Product Officer @LookerData, previously VP of Engineering @Reddit)

Making ‘npm install’ Safe

by Kate Sills (Software Engineer @agoric)

Artificial Pancreas System: #WeAreNotWaiting in Healthcare

by Dana Lewis (Principal Investigator & Researcher)

Securing a Multi-Tenant Kubernetes Cluster

by Kirsten Newcomer (OpenShift Senior Principal Product Manager @RedHat)

QCon London

Day 1

Change Is the Only Constant

by Stuart Davidson (Senior Engineering Manager, Developer Enablement @Skyscanner)

Scaling for the Known Unknown

by Suhail Patel (Backend Engineer @Monzo)

Fine-Grained Sandboxing With V8 Isolates

by Kenton Varda (Tech lead @Cloudflare Workers)

Intuition & Use-Cases of Embeddings in NLP & Beyond

by Jay Alammar (VC and Machine Learning Explainer @STVcapital)

Risk of Climate Change and What Tech Can Do

by Jason Box (Climatologist & Professor in Glaciology at the Geological Survey of Denmark and Greenland)
     Paul Johnston (CEO @roundaboutlabs)

Day 2

Building and Scaling a High-Performance Culture

by Randy Shoup (VP Engineering @WeWork)

Discovering Culture through Artifacts

by Mike McGarr (Engineering Leader, Frontend Infra @Slack)

The Three Faces of DevSecOps

by Guy Podjarny (CEO @SnykSec, previously CTO @Akamai)

Complex Event Flows in Distributed Systems

by Bernd Rücker (Co-founder @Camunda)

WebAssembly and the Future of the Web Platform

by Ashley Williams (Core Rust Team @RustLang)

Day 3

Crossing the River by Feeling the Stones

by Simon Wardley (UK Tech Influencer & Researcher Leading Edge Forum)

Surviving the Zombie Apocalypse

by Andy Walker (Senior Director of Engineering @Skyscanner)

Functional Composition

by Chris Ford (Technical Principal @ThoughtWorksESP)

Amplifying Sources of Resilience: What Research Says

by John Allspaw (DevOps/Resilience Engineering Thought Leader, Co-founder of @AdaptiveCLabs, & Previously CTO @Etsy)

Designing an Ethical Chatbot

by Steve Worswick (4 Time Winner of Loebner Prize (Most lifelike AI) & Senior Artificial Intelligence Designer @pandorabots)

This article is part of our 2019/2020 trends overview. The insights come from our editorial team, all of whom are software engineers, who push the barrier of innovation in their professional lives. Read and reflect on their insights to inspire your tech visions and roadmap for 2020.



Preventing and Dealing with Vulnerabilities with GitLab

By Sergio De Simone

Article originally posted on InfoQ.

One year after the official launch of GitLab’s public bug bounty program, it is time for the company to wrap up the results it brought and how it helped improve security for GitLab and its customers.

GitLab’s bug bounty program brought impressive results, the company says, with hundreds of valid bug reports that were awarded over $500,000 in total.

The program kept our engineers on their toes, challenged and surprised our security team, and helped us keep GitLab more secure.

Taking their bug bounty program public allowed GitLab to learn a number of lessons, from how to triage reports and respond to them, to how to improve communication and transparency. In fact, the company says, they are still learning and actively looking for ways to make their program more effective.

InfoQ took the chance to speak with GitLab Senior Application Security Engineer James Ritchey to learn more about GitLab security strategy and what a bug bounty program can contribute to an organization.

InfoQ: When it comes to security, GitLab has long adopted a policy of transparency and open disclosure. Could you please elaborate on the value this has for GitLab users and investors? Aren’t you worried that disclosing the number and severity of vulnerabilities may negatively impact GitLab’s image?

Ritchey: GitLab as a company holds transparency as one of our core values, because among other things it helps to build customer trust. This extends to our vulnerability program as well. We believe sharing the details of the vulnerabilities that are found and fixed helps build and maintain trust with our customers by demonstrating the importance of building a secure product.

Trust can be lost when a security bug is mishandled and the customer is directly impacted. By disclosing full vulnerability information after 30 days, the customer has time to upgrade or apply mitigations to their environment to protect against those vulnerabilities. Being transparent about our security issues truly illustrates how invested we are in securing GitLab.

InfoQ: In the last few years, the software industry has been constantly improving the way it deals with security and how it protects its systems from vulnerabilities. For example, GitHub provides Security Alerts and code analysis to help developers detect risks as soon as possible. How does GitLab attempt to improve the way developers ensure their software is secure?

Ritchey: GitLab’s focus on enabling the developer to write secure code is reflected in our Secure product team. This team works to build tools that provide security-related information to the developer as soon as their code is pushed to GitLab. This includes dependency analysis to identify vulnerable dependencies and static source code analysis that identifies common insecure coding patterns.

The security team has also contributed tooling in this area. Our red team recently open-sourced a tool called Token Hunter that helps developers discover sensitive data in other areas of GitLab, such as issues or snippets.

InfoQ: A recent report by security firm Snyk highlighted the lack of security ownership among open source developers as one major cause of concern. How does GitLab make sure its developers are aware of their critical role in shipping secure software?

Ritchey: Security awareness communication for developers at GitLab starts at onboarding and continues through their time working on the product. When a developer is onboarded they are provided information on where to find secure coding resources. This includes videos of past secure coding training and a set of guidelines for common vulnerabilities.

Additionally, the application security team has different initiatives throughout the year. One example is bi-weekly office hours that anyone can attend and discuss application security related questions. Another, which will be a first, is that we are planning a Capture The Flag event at this year’s Contribute, our annual community event. We get together to get face-time with one another, build community, and get some work done.

InfoQ: One year ago, you launched a public bug bounty program that awarded over 150 security researchers a total of about $560,000. Could you please summarize how this program performed for GitLab and what you learned through the process?

Ritchey: We received a huge response from the community over the past year, and during this time we learned a lot. One of the most important lessons was that we needed to scale. Specifically scale our communication and procedures. There are many reports and reporters and only a handful of us on the GitLab side. If we didn’t scale, then we’d be smothered in the volume of reports. Our answer to this was to automate as much as possible. For example, an automated bot enabled us to reduce our average time to first response from 48+ hours to less than seven.

Another automation example was that for issues we’ve already triaged and are tracking, reporters would very often ask for updates on when the vulnerability would be fixed. So our automation team developed another bot which would comment on the report with an expected fix date once the issue was scheduled by our product team.

Besides scaling, another significant lesson was that we needed to increase hacker engagement and keep it at a high level. There are many programs the reporters can choose from on HackerOne, so why should they participate in ours and stick with it? You’re competing for the attention of reporters with over 1,000 other programs. So an important thing for us was to listen to the feedback from reporters currently engaged in the program.

One of the top suggestions was that they wanted to reduce the time of bounty payouts. Previously, we were rewarding bounties once the issue was resolved, which could take anywhere from 1 to 4 months depending on the severity of the issue. So after listening to the feedback, in September 2019 we changed how we reward bounties to pay a partial bounty of $1000 upfront at the time of triage. The remainder would be paid when the report was resolved or 90 days had passed, whichever came first.

InfoQ: Establishing a bug bounty program is a key approach to improving the security of a Cloud service. Could you share some suggestions about how to create and run a successful bug bounty program?

Ritchey: I suggest learning from our lessons and from other programs that have been around for a while.

  • Start a public program, and start it sooner rather than later.
  • Scale with automation.
  • Keep hacker engagement high in your program.
  • Pay competitively, especially for High and Critical severity vulnerabilities.
  • Pay fast.
  • Have a large enough scope.
  • Be transparent about your security issues: it shows how dedicated you are to securing your service, thus instilling trust in your customers. Disclose:
    • exactly what the issues were,
    • when each issue was reported,
    • how fast you fixed it.
  • By staying secret about all of these things, there’s no accountability and no visibility into how committed you are to securing your service.

GitLab is a Cloud-based platform for development and DevOps encompassing source code management, wiki, issue tracking, continuous integration/continuous delivery, and more.
