Skip to content

Software Development Blogs: Programming, Software Testing, Agile Project Management

Methods & Tools

Subscribe to Methods & Tools
if you are not afraid to read more than one page to be a smarter software developer, software tester or project manager!

High Scalability - Building bigger, faster, more reliable websites
Syndicate content
Updated: 7 hours 50 min ago

Stuff The Internet Says On Scalability For April 28th, 2017

Fri, 04/28/2017 - 17:05

Hey, it's HighScalability time:

 

Do you understand the power symbol? I always think of O as a circuit being open, or off, and the | as the circuit being closed, or on. Wrong! Really the symbols are binary, 0 for false, or off, 1 for true, or on. Mind blown.
If you like this sort of Stuff then please support me on Patreon.
  • 220,000-Core: largest Google Compute Engine job; 100 million: Netflix subscribers; 1.3M: Sling TV subscribers; 200: Downloadable Modern Art Books; 25%: Americans Won't Subscribe To Traditional Cable; 84%: image payload savings using smart CDN; 10^5: number of world-wide cloud data centers needed; 63%: more Facebook clicks using personality targeting; 2.5 million: red blood cells created per second; 

  • Quotable Quotes:
    • Silicon Valley~ The only reason Gilfoyle and I stayed up 48 f*cking straight hours was to decrease server load, not keep it the same. 
    • Robert Graham: In other words, if the entire Mirai botnet of 2.5 million IoT devices was furiously mining bitcoin, it's total earnings would be $0.25 (25 cents) per day.
    • @BoingBoing: John Deere just told US Copyright office that only corporations can own property, humans merely license it
    • mattbillenstein: Lin Clark's talk makes this sound like they implemented a scheduler in React -- basically JS is single-threaded, so they're implementing their own primitives and a scheduler for executing those on that main thread.
    • Robert M. Pirsig: When analytic thought, the knife, is applied to experience, something is always killed in the process.
    • @vornietom: I honestly feel bad for the people on the Placebo March who thought they were at the Science March but double blind testing is important
    • MIT: we can capture and monitor human breathing and heart rates by relying on wireless reflections off the human body.
    • Mohamed Zahran~ Surprisingly enough traditional homogenous multi-core are really heterogeneous. Why is that? Every core is running at its own frequency. Many processors are now a traditional core and a GPU. FPGAs are already with us. Automata Processor is a specialized processor that can execute non-deterministic finite automata (regular expressions) orders of magnitude faster than a GPU.  Neuromorphic brain inspired chips. Fancy GPUs. 
    • @craigbuj: amazing how fast China Internet companies can scale: ofo: 10+ million daily rides in China Uber: ~6 million daily rides globally
    • knz: CockroachDB's architecture is an emergent property of its source code. 
    • @Jason: Good news: over 70b spent on digital ads in 2016.  Terrifying news: 89% of growth was Facebook & Google. Via @iab
    • @swardley: I think we need to stop thinking about AMZN as a future $1T biz and more think about it as a future $10T biz, possibly much more.
    • @timoreilly: "Algorithms are opinions embedded in code." @mathbabedotorg #TED2017 
    • Google: I think we [Google Cloud] have a pretty good shot at being No. 1 in five years
    • limitless__: Folks who think programmer skill declines when you're 40+ are 100% wrong. What declines is your willingness to put up with stupidity and what increases is your willingness and ability to tell someone to fly a kite when they tell you to work stupid hours and do stupid things.
    • @nicusX: "Don't worry about X. X is transparently managed for you". Reads: "When things go wrong you'll never be able to fix it" #mechanicalSympathy
    • defined: What's up is the rampant ageism in the industry - the perception that you are washed up as a "dinosaur" developer after a certain age, maybe 40 or so, and belong in management. We "dinosaurs" - we happy few - are living evidence to the contrary.
    • user5994461: AWS Spot Instances are under bid. The highest bidder takes the instances, the price changes all the time. Google Spot Instances (preemptibles) are 80% off and that's it. It's simple.
    • James Hamilton: in 10 years, ML will be more than 1/2 the worlds server side footprint.
    • qnovo: if we examine the average capacity in smartphones over the past 5 years, we see that it has grown at about 8% annually. A battery in a 2017 smartphone contains about 40 – 50% more capacity (mAh) than it did in 2012.
    • StorageMojo: Bottom line: the NVRAM market is heating up. And that’s a very good thing for the IT industry.
    • Crazycontini: We need a lot more help to clean up the world’s crypto mess.
    • Pramati Muthalaxe: Irrespective of what Facebook says, all of them have one objective — to get more money out of potential advertisers. That requires a constant decay of your reach.
    • danluu: It looks like, for a particular cache size, the randomized algorithms do better when miss rates are relatively high and worse when miss rates are relatively low,
    • There's just too much. To see all Quotable Quotes please click through to the full article.

  • Is Kubernetes the next OpenStack? The Cloudcast #296. No. The core architecture team for Kubernetes ensures there's a consistency accross the project...

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

Sponsored Post: Etleap, Pier 1, Aerospike, Loupe, Clubhouse, Stream, Scalyr, VividCortex, MemSQL, InMemory.Net, Zohocorp

Tue, 04/25/2017 - 17:00

Who's Hiring? 
  • Pier 1 Imports is looking for an amazing Sr. Website Engineer to join our growing team!  Our customer continues to evolve the way she prefers to shop, speak to, and engage with us at Pier 1 Imports.  Driving us to innovate more ways to surprise and delight her expectations as a Premier Home and Decor retailer.  We are looking for a candidate to be another key member of a driven agile team. This person will inform and apply modern technical expertise to website site performance, development and design techniques for Pier.com. To apply please email cmwelsh@pier1.com. More details are available here.

  • Etleap is looking for Senior Data Engineers to build the next-generation ETL solution. Data analytics teams need solid infrastructure and great ETL tools to be successful. It shouldn't take a CS degree to use big data effectively, and abstracting away the difficult parts is our mission. We use Java extensively, and distributed systems experience is a big plus! See full job description and apply here.

  • Advertise your job here! 
Fun and Informative Events
  • DBTA Roundtable OnDemand Webinar: Leveraging Big Data with Hadoop, NoSQL and RDBMS. Watch this recent roundtable discussion hosted by DBTA to learn about key differences between Hadoop, NoSQL and RDBMS. Topics include primary use cases, selection criteria, when a hybrid approach will best fit your needs and best practices for managing, securing and integrating data across platforms. Brian Bulkowski, CTO and Co-founder of Aerospike, presented along with speakers from Cask Data and Splice Machine. View now.

  • Advertise your event here!
Cool Products and Services
  • A note for .NET developers: You know the pain of troubleshooting errors with limited time, limited information, and limited tools. Log management, exception tracking, and monitoring solutions can help, but many of them treat the .NET platform as an afterthought. You should learn about Loupe...Loupe is a .NET logging and monitoring solution made for the .NET platform from day one. It helps you find and fix problems fast by tracking performance metrics, capturing errors in your .NET software, identifying which errors are causing the greatest impact, and pinpointing root causes. Learn more and try it free today.

  • Etleap provides a SaaS ETL tool that makes it easy to create and operate a Redshift data warehouse at a small fraction of the typical time and cost. It combines the ability to do deep transformations on large data sets with self-service usability, and no coding is required. Sign up for a 30-day free trial.

  • InMemory.Net provides a Dot Net native in memory database for analysing large amounts of data. It runs natively on .Net, and provides a native .Net, COM & ODBC apis for integration. It also has an easy to use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • www.site24x7.com : Monitor End User Experience from a global monitoring network. 

  • Working on a software product? Clubhouse is a project management tool that helps software teams plan, build, and deploy their products with ease. Try it free today or learn why thousands of teams use Clubhouse as a Trello alternative or JIRA alternative.

  • Build, scale and personalize your news feeds and activity streams with getstream.io. Try the API now in this 5 minute interactive tutorial. Stream is free up to 3 million feed updates so it's easy to get started. Client libraries are available for Node, Ruby, Python, PHP, Go, Java and .NET. Stream is currently also hiring Devops and Python/Go developers in Amsterdam. More than 400 companies rely on Stream for their production feed infrastructure, this includes apps with 30 million users. With your help we'd like to ad a few zeros to that number. Check out the job opening on AngelList.

  • Scalyr is a lightning-fast log management and operational data platform.  It's a tool (actually, multiple tools) that your entire team will love.  Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. .  Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • VividCortex is a SaaS database monitoring product that provides the best way for organizations to improve their database performance, efficiency, and uptime. Currently supporting MySQL, PostgreSQL, Redis, MongoDB, and Amazon Aurora database types, it's a secure, cloud-hosted platform that eliminates businesses' most critical visibility gap. VividCortex uses patented algorithms to analyze and surface relevant insights, so users can proactively fix future performance problems before they impact customers.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

  • Advertise your product or service here!

If you are interested in a sponsored post for an event, job, or product, please contact us for more information.

Categories: Architecture

Stuff The Internet Says On Scalability For April 21st, 2017

Fri, 04/21/2017 - 17:06

Hey, it's HighScalability time:

 

Which do you see: Machines freeing people? Lost jobs? Slavery? Hyperactive Skittles?
If you like this sort of Stuff then please support me on Patreon.
  • year 1899: “Nobody has to use the Internet”; 12MPH: Speed news of Lincoln's assassination traveled the US; $200 million: Lyft tips; 500: data structures and algorithms interview questions; %0.00244140625: Odds of 13 straight male Dr. Who regens; 100: gigafactories could power the world; 100K: bots on Messenger; 1 million: containers Netflix lanched in one week; 5.2 trillion: 2014 US revenue; 52,129: iterations to converge on NFL schedule; 36 Gbps: Facebook's network in the sky; 

  • Quotable Quotes:
    • @mipsytipsy: "That doesn't sound hard. I could build that in a weekend."
    • @Noahpinion: The Elon Musk Future is the good future. The Peter Thiel Future is the bad future. But honestly you'll probably get the Jeff Bezos Future.
    • @BenedictEvans: In 2007 Google, Apple, Facebook & Amazon had maybe 50k staff between them. Today it's more like 400k.
    • @AWSonAir: @Expedia inserting 70,000 rows per second of hotel data with Amazon Aurora.
    • @swardley: STOP! If you're thinking of moving to cloud today (as in IaaS), you are so late that you need to consider moving to serverless ->
    • David Rosenthal: Silicon Valley would not exist but for Ph.D.s leaving research to create products in industry.
    • @cmeik: Distributed applications today treat the database like shared memory, and that's why we love things like Spanner.  This is a flawed design.
    • @Jason: Apple's cash hoard swells to five Teslas / four Ubers / 25 Twitters
Categories: Architecture

Stuff The Internet Says On Scalability For April 14th, 2017

Fri, 04/14/2017 - 17:03

Hey, it's HighScalability time:

 

After 20 years, Cassini will not go gently into that good night, it will burn and rave at close of day. (nasa)
If you like this sort of Stuff then please support me on Patreon.
  • 10^15: synapses activated per second in human brain (2/3rds fail); $4.5B: Amazon spend on video (Netflix $6 billion); 22,000: AWS database migrations served; ~15%: Dropbox reduced CPU usage using Brotli; $3.5 trillion: IT spending in 2017; 10%: reduction in QoQ hard drive shipments; 33.3%: Nginx share of webserver market; 37.2 trillion: human cells in a Cell Atlas; 6.2 miles: journey to the center of the earth; 200: lines of code for blockchain; 95%: Wikipedia pages end up at philosophy; 1.2 billion: Messenger monthly users; 

  • Quotable Quotes:
    • Jeff Bezos: Day 2 is stasis. Followed by irrelevance. Followed by excruciating, painful decline. Followed by death. And that is why it is always Day 1.
    • Bob Schmidt: If debugging is the process of removing errors from a design, then designing must be the process of putting errors into a design!
    • @swardley: the gap between where the cutting edge is and where the majority are just seems to increase year on year.
    • Riot Games: We need to provide resources when it's time to grow, we need to react when it gets sick, and we need to do it all as fast as possible at a global scale.
    • masklinn: High-performance native code already does these specialisation, generally on a per-project basis (some projects include multiple allocators for different bits of data), and possibly using a non-OS allocator in the first place
    • @erikbryn: MT: @DKThomp : there are 950k warehouse workers —6X the number of steel workers and miners combined
    • Joeri: The challenge of a rewrite is not in mapping the core architecture and core use case, it's mapping all the edge cases and covering all the end user needs. You need people intimately familiar with the old system to make sure the new system does all the weird stuff which nobody understood but had good reasons that was in the corners of the old system's code. 
    • @redblobgames: 2016 GDC Diablo talk: let's switch from turn-based to real-time 2017 GDC Civilization talk: let's switch from real-time to turn-based
    • @random_walker: Encrypted traffic has a fingerprint—enough to distinguish among 200 Netflix vids with 99.5% accuracy in < 2.5 mins.
    • Sophie Wilson: You’re going to buy a 10-way, 18-way multi-core processor that’s the latest, all because we told you you could buy it and made it available, and we’re going to turn some of those processors off most of the time. So you’re going to pay for logic and we’re going to turn it off so you can’t use it.
    • qq66: But is there anything more personal than a computer programmer writing a bot to send messages for him?
    • Anu Hariharan: Unlike other social products, WeChat does not only measure growth by number of users or messages sent. Instead they also focus on measuring how deeply is the product engaged in every aspect of daily life (e.g., the number of tasks WeChat can help with in a day).
    • @fredwilson: "The real issue here is Facebook’s market power. And we face similar market power issues in search (Google) and commerce (Amazon)"
    • There are so many quotable quotes I couldn't include them all here. Click through to read the full article.

  • Luna Duclos on Game Development and Rebuilding Microservices. Switching from PHP/Python to Go. Go is much faster and uses less CPU. As big as the switch to Go is the switch from Google App Engine to VMs. GAE servers are small and CPU constrained despite the relatively high cost. Their Go cluster runs in the Google Cloud on Google Container Engine.

  • Werner Against the Machine. Wait, aren't you the machine now?

  • Kwabena Boahe on Stanford Seminar: Neuromorphic Chips: Addressing the Nanostransistor Challenge. A dollar bought more and more transistors until 2014, when for the first time the price for transistors went up. Fundamental constraints at the physical level is the cause. The challenge is to continually shrink the footprint of the transistor so it occupies less space. A traffic metaphor is used to explain the difficulty of continually shrinking transistors. Shrinking gives you fewer lanes and electrons can block a lane by being trapped in a pothole. When you get down to one lane and electron is trapped the current flows slowly. Our brains work with ultimately scaled devices...

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

Sponsored Post: Pier 1, Aerospike, Clubhouse, Stream, Scalyr, VividCortex, MemSQL, InMemory.Net, Zohocorp

Tue, 04/11/2017 - 17:05

Who's Hiring? 
  • Pier 1 Imports is looking for an amazing Sr. Website Engineer to join our growing team!  Our customer continues to evolve the way she prefers to shop, speak to, and engage with us at Pier 1 Imports.  Driving us to innovate more ways to surprise and delight her expectations as a Premier Home and Decor retailer.  We are looking for a candidate to be another key member of a driven agile team. This person will inform and apply modern technical expertise to website site performance, development and design techniques for Pier.com. To apply please email cmwelsh@pier1.com. More details are available here.

  • Etleap is looking for Senior Data Engineers to build the next-generation ETL solution. Data analytics teams need solid infrastructure and great ETL tools to be successful. It shouldn't take a CS degree to use big data effectively, and abstracting away the difficult parts is our mission. We use Java extensively, and distributed systems experience is a big plus! See full job description and apply here.

  • Advertise your job here! 
Fun and Informative Events
  • DBTA Roundtable OnDemand Webinar: Leveraging Big Data with Hadoop, NoSQL and RDBMS. Watch this recent roundtable discussion hosted by DBTA to learn about key differences between Hadoop, NoSQL and RDBMS. Topics include primary use cases, selection criteria, when a hybrid approach will best fit your needs and best practices for managing, securing and integrating data across platforms. Brian Bulkowski, CTO and Co-founder of Aerospike, presented along with speakers from Cask Data and Splice Machine. View now.

  • Advertise your event here!
Cool Products and Services
  • A note for .NET developers: You know the pain of troubleshooting errors with limited time, limited information, and limited tools. Log management, exception tracking, and monitoring solutions can help, but many of them treat the .NET platform as an afterthought. You should learn about Loupe...Loupe is a .NET logging and monitoring solution made for the .NET platform from day one. It helps you find and fix problems fast by tracking performance metrics, capturing errors in your .NET software, identifying which errors are causing the greatest impact, and pinpointing root causes. Learn more and try it free today.

  • Etleap provides a SaaS ETL tool that makes it easy to create and operate a Redshift data warehouse at a small fraction of the typical time and cost. It combines the ability to do deep transformations on large data sets with self-service usability, and no coding is required. Sign up for a 30-day free trial.

  • InMemory.Net provides a Dot Net native in memory database for analysing large amounts of data. It runs natively on .Net, and provides a native .Net, COM & ODBC apis for integration. It also has an easy to use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • www.site24x7.com : Monitor End User Experience from a global monitoring network. 

  • Working on a software product? Clubhouse is a project management tool that helps software teams plan, build, and deploy their products with ease. Try it free today or learn why thousands of teams use Clubhouse as a Trello alternative or JIRA alternative.

  • Build, scale and personalize your news feeds and activity streams with getstream.io. Try the API now in this 5 minute interactive tutorial. Stream is free up to 3 million feed updates so it's easy to get started. Client libraries are available for Node, Ruby, Python, PHP, Go, Java and .NET. Stream is currently also hiring Devops and Python/Go developers in Amsterdam. More than 400 companies rely on Stream for their production feed infrastructure, this includes apps with 30 million users. With your help we'd like to ad a few zeros to that number. Check out the job opening on AngelList.

  • Scalyr is a lightning-fast log management and operational data platform.  It's a tool (actually, multiple tools) that your entire team will love.  Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. .  Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • VividCortex is a SaaS database monitoring product that provides the best way for organizations to improve their database performance, efficiency, and uptime. Currently supporting MySQL, PostgreSQL, Redis, MongoDB, and Amazon Aurora database types, it's a secure, cloud-hosted platform that eliminates businesses' most critical visibility gap. VividCortex uses patented algorithms to analyze and surface relevant insights, so users can proactively fix future performance problems before they impact customers.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

  • Advertise your product or service here!

If you are interested in a sponsored post for an event, job, or product, please contact us for more information.

Categories: Architecture

Five things we’ve learned about monitoring containers and their orchestrators

Mon, 04/10/2017 - 16:56

This is a guest post by Apurva Davé, who is part of the product team at Sysdig.

Having worked with hundreds of customers on building a monitoring stack for their containerized environments, we’ve learned a thing or two about what works and what doesn’t. The outcomes might surprise you - including the observation that instrumentation is just as important as the application when it comes to monitoring.

In this post, I wanted to cover some details around what it takes to build a scale-out, highly reliable monitoring system to work across tens of thousands of containers. I’ll share a bit about what our infrastructure looks like, the design choices we made, and tradeoffs. The five areas I’ll cover:

  • Instrumenting the system

  • Relating your data to your applications, hosts, and containers.

  • Leveraging orchestrators

  • Deciding what to data to store

  • How to enable troubleshooting in containerized environments

For context, Sysdig is the container monitoring company. We’re based on the open source Linux troubleshooting project by the same name. The open source project allows you to see every single system call down to process, arguments, payload, and connection on a single host. The commercial offering turns all this data into thousands of metrics for every container and host, aggregates it all, and gives you dashboarding, alerting, and an htop-like exploration environment.

Ok, let’s get into the details, starting with the impact containers have had on monitoring systems.

Why do containers change the rules of the monitoring game?
Categories: Architecture

Stuff The Internet Says On Scalability For April 7th, 2017

Fri, 04/07/2017 - 16:56

Hey, it's HighScalability time:

 

Visualization of the magic system behind software infrastructure. (eyezmaze@ThePracticalDev
If you like this sort of Stuff then please support me on Patreon.
  • 10-20: aminoacids can be made per second; 64800x: faster DDL Aurora vs MySQL; 25 TFLOPS: cap for F1 simulations; 15x to 30x: Tensor Processing Unit faster than GPUs and CPUs; 100 Million: Intel transistors per square millimeter; 25%: Internet traffic generated by Google; $1 million: Tim Berners-Lee wins Turing Award; 43%: phones FBI couldn't open because of crypto;

  • Quotable Quotes:
    • @adulau: To summarize the discussions of yesterday. All tor exit nodes are evil except the ones I operate.
    • @sinavaziri: Let's say a data center costs $1-2B. Then the TPU saved Google $15-30B of capex?
    • Vinton G. Cerf: While it would be a vast overstatement to ascribe all this innovation to genetic disposition, it seems to me inarguable that much of our profession was born in the fecund minds of emigrants coming to America and to the West over the past century.
    • Alan Bundy: AI systems are not just narrowly focused by design, because we have yet to accomplish artificial general intelligence, a goal that still looks distant. 
    • JamesBarney: Soo much this, just worked on a project that sacrificed reliability, maintainability, and scalability to use a real time database to deal with loads that were on the order of 70 values or 7 writes a second.
    • bobdole1234: 3.5x faster than CPU doesn't sound special, but when you're building inference capacity by the megawatt, you get a lot more of that 3.5x faster TPU inside that hard power constraint.
    • Eugenio Culurciello: As we have been predicting for 10 years, in SoC you can achieve > 10x more performance that current GPUs and > 100x more performance per watt.
    • Google: The TPU’s deterministic execution model is a better match to the 99th-percentile response-time requirement of our NN applications than are the time-varying optimizations of CPUs and GPUs (caches, out-of-order execution, multithreading, multiprocessing, prefetching, ...) that help average throughput more than guaranteed latency. 
    • visarga: TPU excited me too at first, but when I realized that it is not related to training new networks (research) and is useful only for large scale deployment, I toned down my enthusiasm a little. 
    • Julian Friedman: Kube is being designed by system administrators who like distributed systems, not for programmers who want to focus on their apps.
    • shadowmint: Given what I've seen, I'd argue that clojure has an inherent complexity that results in poor code quality outcomes during the software maintenance cycle.
    • weberc2: I like Go, but it's not dramatically faster than Java. Any contest between the two of them will probably just be a back and forth of optimizations. They share pretty much the same upper bound.
    • adrianratnapala: All this means is that we should stop thinking of this stuff as RAM. Only the L1 cache is really RAM. Everything else is just a kind of fast, volatile, solid state disk that just happens to share an address space with the RAM.
    • pbreit: Getting a million users is infinitely harder than scaling a system to handle a million users. Most systems could run comfortably on a Raspberry Pi.
    • @sustrik: If you want your protocol to be fully reliable in the face of either peer shutting down, the terminal handshake has to be asymmetric. As we've seen above, TCP protocol has symmetric termination algorithm and thus can't, by itself, guarantee full reliability.
    • @damonedwards: Unit tests are critical for good dev, but aren't really ops concern. Integration tests are critical for good ops. Ops wants more int tests.
    • mannigfaltig: the brain appears to spend about 4.7 bits per synapse (26 discernible states, given the noisy computation environment of the brain); so it seems to be plenty enough for general intelligence. This could, of course, merely be a biological limit and on silicon more fine-grained weights might be the optimum.
    • marwanad: The main power of GraphQL is for client developers and lies in the decoupling it provides between the client and server and the ability to fulfill the client needs in a single round trip. This is great for mobile devices with slower networks.
    • kyleschiller: As a pretty good rule of thumb, a system that fails 1/nth of the time and has n opportunities to fail has ~.63 probability of failure, where n is more than ~10.
    • jjirsa: databases aren't where you want to have hipster tech. You want boring things that work. For me, Cassandra is the boring thing that works. 
    • @etherealmind: "rule #1 of Enterprise IT: easier to spend 10 million on equipment than 100k for a person. A third person would increase capacity by 30%"
    • @SwiftOnSecurity: “Just pick a good VPN” is like telling thirsty people to “go to a store and drink clear liquid.” They drank bleach, but at least you helped.
    • falsedan: There's 2 secrets to scaling to millions of users: 1. You aren't going to have millions of users so any work you do to support it is stopping you from delivering features that will make your existing 10 clients happier. 2. Write code that can be replaced (i.e. design for change). 
    • X86BSD: Have you tested running it on a FreeBSD box with ZFS? It has lz4 compression by default and makes such a great storage solution for PG. You get compression, snapshots, replication (not quite realtime but close), self healing, etc etc in a battled hardened and easy to manage filesystem and storage manager. I've found you can't beat ZFS and PG for most applications. Edge cases exist of course everywhere.

  • Worried about too much infrastructure? Only 2% of DNA codes for proteins, the other 98% codes for RNA. Harry Noller Lecture. Maybe lots of infrastructure is not a bad thing. One of they key differences in programming and biology is how in biology form completely determines function. Just amazing to watch in action: mRNA Translation (Advanced). Programming is the complete opposite.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

Stuff The Internet Says On Scalability For March 31st, 2017

Fri, 03/31/2017 - 16:56

Hey, it's HighScalability time:

 

What lies beneath? Networks...of blood vessels. (Wellcome Image Awards)
If you like this sort of Stuff then please support me on Patreon.
  • 5000: node (150,000 pod) clusters in Kubernetes 1.6; 15 years: time to @spacex launch with a recycled rocket booster; 174 mbps: Internet speed in Dublin; 10 nm: Intel’s new Moore approved process; 30 minutes: to create Samsung's S8; 50 billion: of your cells replaced each day; 2 million: new red blood cells per second; 3dbm: attenuation of human body, same as a wall; 12: hours of tardis sounds; 350: pages to stop a bullet; 2: meters of DNA pack in a space .000006m wide; 

  • Quotable Quotes:
    • @swardley: Having met many "leaders" in technology & business, I wouldn't bet on the future survival of humanity. If anything AI might help the odds
    • Francis Pouliot: Any contentious hard fork of the Bitcoin blockchain shall be considered an alternative cryptocurrency (altcoin), regardless of the relative hashing power on the forked chain.
    • @coda: WhatsApp: 900M users, built w/ < 35 devs, using #erlang Krispy Kreme: 1004 locations, 3700 employees, original glazed is 190 #calories
    • @BenedictEvans: Still think it's interesting Instagram shifted emphasis from interests to friends. Is that a law of nature for social if you want scale?
    • @johnrobb: "each robot per thousand workers decreased employment by 6.2 workers and wages by 0.7 percent"
    • Alex Woodie: The Hadoop dream of unifying data and compute in a distributed manner has all but failed in a smoking heap of cost and complexity, according to technology experts and executives who spoke to Datanami.
    • @RichRogersIoT: "First you learn the value of abstraction, then you learn the cost of abstraction, then you are ready to engineer." - @KentBeck
    • @codemanship: Don't explain code quality to execs. Explain high cost of change. Explain slowing down of innovation. Explain longer cycle times.
    • @malwareunicorn: Bad malware pickup lines: Hey girl, I heard you like sandboxes. I would never try to escape yours ;)
    • dkhenry: The selling of data isn't the policy you need to fight. The monopoly power of ISP's is the problem you must push back on. 
    • @MaxWendkos: An SEO expert walks into a bar, bars, pub, tavern, public house, Irish pub, drinks, beer, alcohol
    • Barry Lampert: the point of Amazon isn't to offer a consumer the absolute lowest price possible; it's to offer the lowest price possible given the convenience that Amazon offers
    • Daniel Lemire: Let us make the statement precise: Most performance or memory optimizations are useless.
    • @sarahmei: People run into trouble with DRY because it doesn't tell you *what* not to repeat. People assume syntax, but it's actually concepts.
    • Dan Rayburn: China suffers from 9.2% transfer failure rate (similar to Malaysia, India and Brazil), and a high packet loss.  These two parameters have severe impact on content download time and overall performance.
    • Daniel Lemire: I submit to you that it is no accident if the StackOverflow list of top-paying programming languages is made of obscure languages. They are comparing the average of a niche against the average of a large population
    • For even more Quotable Quotes please click through to the main article.

  • For good WiFi you don't necessarily need one big powerful router bristling with antenna like a radiation mutated ant. 802.eleventy what? A deep dive into why Wi-Fi kind of suck and New Screen Savers (@20 min). You want a true mesh network (Plume). WiFi should whisper, use 5G to create pools of WiFi in each room so signals don't penetrate between rooms. Lots of little access points can automatically find a path through your house. Use a wired backhaul for best performance. Raw throughput isn't the best measure. How does it perform with many people using many devices? Roaming isn't always well supported. Consider how well the system hands-off devices as you walk through the house. 

  • BloomCON 2017 Videos are now available. You might like Honey, I Stole Your C2 [Command-and-control] Server: A dive into attacker infrastructure.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

How to speed up your MySQL with replication to in-memory database

Wed, 03/29/2017 - 16:56

Original article available at https://habrahabr.ru/company/mailru/blog/323870/

I’d like to share with you an article based on my talk at Tarantool Meetup(the video is in Russian, though). It’s a short story of why Mamba, one of the biggest dating websites in the world and the largest one in Russia, started using Tarantool. Why did we decide to busy ourselves with MySQL-to-Tarantool replication?

First, we had to migrate to MySQL 5.7 at some point, but this version didn’t have HandlerSocket that was being actively used on our MySQL 5.6 servers. We even contacted the Percona team — and they confirmed MySQL 5.6 is the last version to have HandlerSocket.

Second, we gave Tarantool a try and were pleased with its performance. We compared it against Memcached as a key-value store and saw the speed double from 0.6 ms to 0.3 ms on the same hardware. In relative terms, Tarantool’s twice as fast as Memcached. In absolute terms, it’s not that cool, but still impressive.

Third, we wanted to keep the whole existing architecture. There’s a MySQL master server and its slaves — we didn’t want to change anything in this structure. Can MySQL 5.6 slaves with HandlerSocket be replaced with something else without having to make significant architectural changes?

We learned that the Mail.Ru Group team has a replicator they created for their own purposes. The idea of replicating data from MySQL to Tarantool belongs to them. We asked the team to share the source code, which they did. We had to rewrite the code, though, since it worked with MySQL 5.1 and Tarantool 1.5, not 1.7. The replicator uses libslave, an open-source solution for reading events from a MySQL master server, and is built statically without any of MySQL’s system libraries. It’s been open-sourcedunder the BSD license, so anyone can use it for free.

Replication constraints
Categories: Architecture

Sponsored Post: ButterCMS, Aerospike, Loupe, Clubhouse, Stream, Scalyr, VividCortex, MemSQL, InMemory.Net, Zohocorp

Wed, 03/29/2017 - 16:56

Who's Hiring? 
  • Etleap is looking for Senior Data Engineers to build the next-generation ETL solution. Data analytics teams need solid infrastructure and great ETL tools to be successful. It shouldn't take a CS degree to use big data effectively, and abstracting away the difficult parts is our mission. We use Java extensively, and distributed systems experience is a big plus! See full job description and apply here.

  • Advertise your job here! 
Fun and Informative Events
  • Analyst Webinar: Forrester Study on Hybrid Memory NoSQL Architecture for Mission-Critical, Real-Time Systems of Engagement. Thursday, March 30, 2017 | 11 AM PT / 2 PM ET. In today’s digital economy, enterprises struggle to cost-effectively deploy customer-facing, edge-based applications with predictable performance, high uptime and reliability. A new, hybrid memory architecture (HMA) has emerged to address this challenge, providing real-time transactional analytics for applications that require speed, scale and a low total cost of ownership (TCO). Forrester recently surveyed IT decision makers to learn about the challenges they face in managing Systems of Engagement (SoE) with traditional database architectures and their adoption of an HMA. Join us as our guest speaker, Forrester Principal Analyst Noel Yuhanna, and Aerospike’s VP Marketing, Cuneyt Buyukbezci, discuss the survey results and implications for your business. Learn and register

  • Advertise your event here!
Cool Products and Services
  • Etleap provides a SaaS ETL tool that makes it easy to create and operate a Redshift data warehouse at a small fraction of the typical time and cost. It combines the ability to do deep transformations on large data sets with self-service usability, and no coding is required. Sign up for a 30-day free trial.

  • InMemory.Net provides a Dot Net native in memory database for analysing large amounts of data. It runs natively on .Net, and provides a native .Net, COM & ODBC apis for integration. It also has an easy to use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • www.site24x7.com : Monitor End User Experience from a global monitoring network. 

  • ButterCMS is an API-based CMS that seamlessly drops into your app or website. Great for blogs, dynamic pages, knowledge bases, and more. Butter works with any language/framework including Ruby, Rails, Node.js, .NET, Python, Django, Flask, React, Angular, Go, PHP, Laravel, Elixir, Phoenix, and Meteor.

  • Working on a software product? Clubhouse is a project management tool that helps software teams plan, build, and deploy their products with ease. Try it free today or learn why thousands of teams use Clubhouse as a Trello alternative or JIRA alternative.

  • A note for .NET developers: You know the pain of troubleshooting errors with limited time, limited information, and limited tools. Log management, exception tracking, and monitoring solutions can help, but many of them treat the .NET platform as an afterthought. You should learn about Loupe...Loupe is a .NET logging and monitoring solution made for the .NET platform from day one. It helps you find and fix problems fast by tracking performance metrics, capturing errors in your .NET software, identifying which errors are causing the greatest impact, and pinpointing root causes. Learn more and try it free today.

  • Build, scale and personalize your news feeds and activity streams with getstream.io. Try the API now in this 5 minute interactive tutorial. Stream is free up to 3 million feed updates so it's easy to get started. Client libraries are available for Node, Ruby, Python, PHP, Go, Java and .NET. Stream is currently also hiring Devops and Python/Go developers in Amsterdam. More than 400 companies rely on Stream for their production feed infrastructure, this includes apps with 30 million users. With your help we'd like to ad a few zeros to that number. Check out the job opening on AngelList.

  • Scalyr is a lightning-fast log management and operational data platform.  It's a tool (actually, multiple tools) that your entire team will love.  Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. .  Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • VividCortex is a SaaS database monitoring product that provides the best way for organizations to improve their database performance, efficiency, and uptime. Currently supporting MySQL, PostgreSQL, Redis, MongoDB, and Amazon Aurora database types, it's a secure, cloud-hosted platform that eliminates businesses' most critical visibility gap. VividCortex uses patented algorithms to analyze and surface relevant insights, so users can proactively fix future performance problems before they impact customers.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

If you are interested in a sponsored post for an event, job, or product, please contact us for more information.

Categories: Architecture

Faster Networks + Cheaper Messages => Microservices => Functions => Edge

Mon, 03/27/2017 - 17:25

When Adrian Cockroft—the guy who helped put the loud in Cloud through his energetic evangelism of Cloud Native and Microservice architectures—talks about what’s next, it pays to listen. And you can listen, here’s a fascinating forward looking talk he gave at microXchg 2017: Shrinking Microservices to Functions. It’s typically Cockroftian: understated, thoughtful, and full of insight drawn from experience.

Adrian makes a compelling case that the same technology drivers, faster networking and cheaper messaging, that drove the move to Microservices are now driving the move to Functions.

The payoffs are all those you’ve no doubt heard about Serverless for some time, but Adrian develops them in an interesting way. He traces how architectures have evolved over time. Take a look at my gloss of his talk for more details.

What’s next after Functions? Adrian talks about pushing Lambda functions to the edge. A topic I’m excited about and have been interested in for sometime, though I didn’t quite see it playing out like this.

Datacenters disappear. Functions are not running in an AWS region anymore, code is placed near the customer using a CDN at CDN endpoints. Now you have a fully distributed, at the edge, low latency, milliseconds from the customer way of running code. Now you can build architectures that are partly in the datacenter, partly at the edge, and partly at the customer premises. And since this is AWS, it’s all, of course, built around Lambda. AWS Greengrass and Snowball Edge are peeks into what the future might look like.

There’s a hidden tension here. Once you put code at the edge you violate two of Lambda’s key assumptions: functions are composed using scalable backend services; low latency messaging. The edge will have a high latency path back to services in the datacenter, so how do you make a function based distributed application at the edge? Does edge computing argue for a more retro architecture with fewer messages back to a more monolithic core?

Or does edge computing require something completely different? Here’s one thought as to what that something completely different might look like: Datanet: A New CRDT Database That Let's You Do Bad Bad Things To Distributed Data.

Now, let’s see the future by first taking a tour of the past….

From Monoliths, to Microservices, to Functions
Categories: Architecture

Stuff The Internet Says On Scalability For March 24th, 2017

Fri, 03/24/2017 - 16:56

Hey, it's HighScalability time:

 This is real and oh so eerie. Custom microscope takes a 33 hour time lapse of a tadpole egg dividing.
If you like this sort of Stuff then please support me on Patreon.
  • 40Gbit/s: indoor optical wireless networks; 15%: energy produced by wind in Europe; 5: new tasty particles; 2000: Qubits are easy; 30 minutes: flight time for electric helicopter; 42.9%: of heathen StackOverflowers prefer tabs;

  • Quotable Quotes:
    • @RichRogersIoT: "Did you know? The collective noun for a group of programmers is a merge-conflict." - @omervk
    • @tjholowaychuk: reviewed my dad's company AWS expenses, devs love over-provisioning, by like 90% too, guess that's where "serverless" cost savings come in
    • @karpathy: Nature is evolving ~7 billion ~10 PetaFLOP NI agents in parallel, and has been for ~10M+s of years, in a very realistic simulator. Not fair.
    • @rbranson: This is funny, but legit. Production software tends to be ugly because production is ugly. The ugliness outpaces our ability to abstract it.
    • @joeweinman: @harrietgreen1 : Watson IoT center opened in Munich... $200 million dollar investment; 1000 engineers #ibminterconnect
    • David Gerard: This [IBM Blockchain Service] is bollocks all the way down.
    • digi_owl: Sometimes it seems that the diff between a CPU and a cluster is the suffix put on the latency times.
    • Scott Aaronson: I’m at an It from Qubit meeting at Stanford, where everyone is talking about how to map quantum theories of gravity to quantum circuits acting on finite sets of qubits, and the questions in quantum circuit complexity that are thereby raised.
    • Founder Collective: Firebase didn’t try to do everything at once. Instead, they focused on a few core problems and executed brilliantly. “We built a nice syntax with sugar on top,” says Tamplin. “We made real-time possible and delightful.” It is a reminder that entrepreneurs can rapidly add value to the ecosystem if they really focus.
    • Elizabeth Kolbert: Reason developed not to enable us to solve abstract, logical problems or even to help us draw conclusions from unfamiliar data; rather, it developed to resolve the problems posed by living in collaborative groups. 
    • Western Union: the ‘telephone’ has too many shortcomings to be seriously considered as a means of communication.
    • Arthur Doskow: being fair, being humane may cost money. And this is the real issue with many algorithms. In economists’ terms, the inhumanity associated with an algorithm could be referred to as an externality. 
    • Francis: The point is that even if GPUs will support lower precision data types exclusively for AI, ML and DNN, they will still carry the big overhead of the graphics pipeline, hence lower efficiency than an FPGA (in terms of FLOPS/WATT). The winner? Dedicated AI processors, e.g. Google TPU
    • James Glasnapp: When we move out of the physical space to a technological one, how is the concept of a “line” assessed by the customer who can’t actually see the line? 
    • Frank: On the other hand, if institutionalized slavery still existed, factories would be looking at around $7,500 in annual costs for housing, food and healthcare per “worker”.
    • Baron Schwartz: If anyone thought that NoSQL was just a flare-up and it’s died down now, they were wrong...In my opinion, three important areas where markets aren’t being satisfied by relational technologies are relational and SQL backwardness, time series, and streaming data. 
    • CJefferson: The problem is, people tell me that if I just learn Haskell, Idris, Closure, Coffescript, Rust, C++17, C#, F#, Swift, D, Lua, Scala, Ruby, Python, Lisp, Scheme, Julia, Emacs Lisp, Vimscript, Smalltalk, Tcl, Verilog, Perl, Go... then I'll finally find 'programming nirvana'.
    • @spectatorindex: Scientists had to delete Urban Dictionary's data from the memory of IBM's Watson, because it was learning to swear in its answers.
    • Animats: [Homomorphically Encrypted Deep Learning] is a way for someone to run a trained network on their own machine without being able to extract the parameters of the network. That's DRM.
    • Dino Dai Zovi: Attackers will take the least cost path through an attack graph from their start node to their goal node.
    • @hshaban: JUST IN: Senate votes to repeal web privacy rules, allowing broadband providers to sell customer data w/o consent including browsing history
    • KBZX5000: The biggest problem you face, as a student, when taking a programming course at a University level, is that the commercially applicable part of it is very limited in scope. You tend to become decent at writhing algorithms. A somewhat dubious skill, unless you are extremely gifted in mathematics and / or somehow have access to current or unique hardware IP's (IP as in Intellectual Property).
    • Brian Bailey: The increase in complexity of the power delivery network (PDN) is starting to outpace increases in functional complexity, adding to the already escalating costs of modern chips. With no signs of slowdown, designers have to ensure that overdesign and margining do not eat up all of the profit margin.
    • rbanffy: Those old enough will remember the AS/400 (now called iSeries) computers map all storage to a single address space. You had no disk - you had just an address space that encompassed everything and an OS that dealt with that.
    • @disruptivedean: Biggest source of latency in mobile networks isn't milliseconds in core, it's months or years to get new cell sites / coverage installed
    • Greg Ferro: Why Is 40G Ethernet Obsolete? Short Answer: COST. The primary issue is that 40G Ethernet uses 4x10G signalling lanes. On UTP, 40G uses 4 pairs at 10G each. 
    • @adriaanm: "We chose Scala as the language because we wanted the latest features of Spark, as well as [...] types, closures, immutability [...]"Adriaan Moors added,
    • ajamesm: There's a difference between (A) locking (waiting, really) on access to a critical section (where you spinlock, yield your thread, etc.) and (B) locking the processor to safely execute a synchronization primitive (mutexes/semaphores).
    • @evan2645: "Chaos doesn't cause problems, it reveals them" - @nora_js #SREcon17Americas #SRECon17
    • chrissnell: We've been running large ES clusters here at Revinate for about four years now. I've found the sweet spot to be about 14-16 data nodes, plus three master-only nodes. Right now, we're running them under OpenStack on top of our own bare metal with SAS disks. It works well but I have been working on a plan to migrate them to live under Kubernetes like the rest of our infrastructure. I think the answer is to put them in StatefulSets with local hostPath volumes on SSD.
    • @beaucronin: Major recurring theme of deep learning twitter is how even those 100% dedicated to the field can't keep up with progress.
    • Chris McNab: VPN certificates and keys are often found within and lifted from email, ticketing, and chat services.
    • @bodil: And it took two hours where the Rust version has taken three days and I'm still not sure it works.
    • azirbel: One thing that's generalizable (though maybe obvious) is to explicitly define the SLAs for each microservice. There were a few weeks where we gave ourselves paging errors every time a smaller service had a deploy or went down due to unimportant errors.
    • bigzen: I'm worn out on articles dissing the performance of SQL databases without quoting any hard numbers and then proceeding to replace the systems with no thanks of development in the latest and great tech. I have nothing against spark, but I find it very hard to believe that alarm code is now readable than SQL. In fact, my experience is just the opposite.
    • jhgg: We are experimenting with webworkers to power a very complicated autocomplete and scoring system in our client. So far so good. We're able to keep the UI running at 60fps while we match, score and sort results in a web-worker.
    • DoubleGlazing: NoSQL doesn't reduce development effort. What you gain from not having to worry about modifying schemas and enforcing referential integrity, you lose from having to add more code to your app to check that a DB document has a certain value. In essence you are moving responsibility for data integrity away from the DB and in to your app, something I think is quite dangerous.
    • Const-me: Too bad many computer scientists who write books about those algorithms prefer to view RAM in an old-fashioned way, as fast and byte-addressable.
    • Azur: It always annoys me a bit when tardigrades are described as extremely hardy: they are not. It is ONLY in the desiccated, cryptobiotic, form they are resistant to adverse conditions.
    • rebootthesystem: Hardware engineers can design FPGA-based hardware optimized for ML. A second set of engineers then uses these boards/FPGA's just as they would GPU's. They write code in whatever language to use them as ML co-processors. This second group doesn't have to be composed of hardware engineers. Today someone using a GPU doesn't have to be a hardware engineer who knows how to design a GPU. Same thing.

  • There should be some sort of Metcalfe's law for events. Maybe: the value of a platform is proportional to the square of the number of scriptable events emitted by unconnected services in the system. CloudWatch Events Now Supports AWS Step Functions as a Target@ben11kehoe: This is *really* useful: Automate your incident response processes with bulletproof state machines #aws

  • Cute faux O'Reilly book cover. Solving Imaginary Scaling Issues.

  • Intel's Optane SSD is finally out, though not quite meeting it's initial this will change everything promise, it still might change a lot of things. Intel’s first Optane SSD: 375GB that you can also use as RAM. 10x DRAM latency. 1/1000 NAND latency. 2400MB/s read, 2000MB/s write. 30 full-drive writes per day. 2.5x better density. $4/GB (1/2 RAM cost). 1.5TB capacity. 500k mixed random IOPS. Great random write response. Targeted at power users with big files, like databases. NDAs are still in place so there's more to learn later. PCPerspective: comparing a server with 768GB of DRAM to one with 128GB of DRAM combined with a pair of P4800X's, 80% of the transactions per second were possible (with 1/6th of the DRAM). More impressive was that matrix multiplication of the data saw a 1.1x *increase* in performance. This seems impossible, as Optane is still slower than DRAM, but the key here was that in the case of the DRAM-only configuration, half of the database was hanging off of the 'wrong' CPU.  foboz1: For anyone think that this a solution looking for a problem, think about two things: Big Data and mobile/embedded. Big Data has an endless appetite for large quantities for memory and fast storage; 3D XPoint plays into the memory hierarchy nicely. At the extreme other end of the scale, it may be fast enough to obviate the need for having DRAM+NAND in some applications. raxx7: And 3D XPoint isn't free of limitations yet. RAM has 50-100 ns latency, 50 GB/s bandwidth (128 bit interface) and unlimited write endurance. If 3D XPoint NVDIMM can't deliver this, we'll still need to manage the difference between RAM and 3D XPoint NVDIMM. zogus: The real breakthrough will come, I think, when the OS and applications are re-written so that they no longer assume that a computer's memory consists of a small, fast RAM bank and a huge, slow persistent set of storage--a model that had held true since just about forever. VertexMaster: Given that DRAM is currently an order of magnitude faster (and several orders vs this real-world x-point product) I really have a hard time seeing where this fits in. sologoub: we built a system using Druid as the primary store of reporting data. The setup worked amazingly well with the size/cardinality of the data we had, but was constantly bottlenecked at paging segments in and out of RAM. Economically, we just couldn't justify a system with RAM big enough to hold the primary dataset...I don't have access to the original planning calculations anymore, but 375GB at $1520 would definitely have been a game changer in terms of performance/$, and I suspect be good enough to make the end user feel like the entire dataset was in memory.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

Stuff The Internet Says On Scalability For March 17th, 2017

Fri, 03/17/2017 - 17:08

Hey, it's HighScalability time:

 

Can it be a coincidence trapping autonomous cars is exactly how demons are trapped on Supernatural?
If you like this sort of Stuff then please support me on Patreon.
  • billion billion: exascale operations per second; 250ms: connection time saved by zero round trip time resumption; 800 Million: tons of prey eaten by spiders; 90%: accuracy of quantum computer recognizing trees; 80 GB/s: S3 across 2800 simultaneous functions;

  • Quotable Quotes:
    • @GossiTheDog: Here's something to add to your security threat model: backups. Why steal live data and when you can drive away with exact replica?
    • @ThePublicSquare: "California produces 160% of its 1990 manufacturing, but with just 60% of the workers." -@uclaanderson economist Jerry Nickelsburg
    • @rbranson: makes total sense. I have a friend (who is VC-backed) that has stuff in Azure, GCloud, and AWS to maximize the free credits.
    • @AndrewYNg: If not for US govt funding (DARPA, NSF), US wouldn't be an AI leader today. Proposed cuts to science is big step in wrong direction.
    • @CodeWisdom: "To understand a program you must become both the machine and the program." - Alan Perlis 
    • @codemanship: What does it take to achieve Continuous Delivery? 1. Continuous testing. e.g., Google have 4.2M automated tests, run avg of 35x a day
    • @sebastianstadil: Azure Storage services are down. They really are doing everything like AWS.
Categories: Architecture

Architecture of Probot - My Slack and Messenger Bot for Answering Questions

Wed, 03/15/2017 - 17:25

I programmed a thing. It’s called Probot. Probot is a quick and easy way to get high quality answers to your accounting and tax questions. Probot will find a real live expert to answer your question and handle all the details. You can get your questions answered over Facebook Messenger, Slack, or the web. Answers start at $10. That’s the pitch.

Seems like a natural in this new age of bots, doesn’t it? I thought so anyway. Not so much (so far), but more on that later.

I think Probot is interesting enough to cover because it’s a good example of how one programmer--me---can accomplish quite a lot using today’s infrastructure.

All this newfangled cloud/serverless/services stuff does in fact work. I was able to program a system spanning Messenger, Slack, and the web, in a way that is relatively scalabile, available, and affordable, while requiring minimal devops.

Gone are the days of worrying about VPS limits, driving down to a colo site to check on a sick server, or even worrying about auto-scaling clusters of containers/VMs. At least for many use cases.

Many years of programming experience and writing this blog is no protection against making mistakes. I made a lot of stupid stupid mistakes along the way, but I’m happy with what I came up with in the end.

Here’s how Probot works....

Platform
Categories: Architecture

Sponsored Post: Aerospike, Loupe, Clubhouse, GoCardless, Auth0, InnoGames, Contentful, Stream, Scalyr, VividCortex, MemSQL, InMemory.Net

Tue, 03/14/2017 - 16:56

Who's Hiring?
  • GoCardless is building the payments network for the internet. We’re looking for DevOps Engineers to help scale our infrastructure so that the thousands of businesses using our service across Europe can take payments. You will be part of a small team that sets the direction of the GoCardless core stack. You will think through all the moving pieces and issues that can arise, and collaborate with every other team to drive engineering efforts in the company. Please apply here.

  • InnoGames is looking for Site Reliability Engineers. Do you not only want to play games, but help building them? Join InnoGames in Hamburg, one of the worldwide leading developers and publishers of online games. You are the kind of person who leaves systems in a better state than they were before. You want to hack on our internal tools based on django/python, as well as improving the stability of our 5000+ Debian VMs. Orchestration with Puppet is your passion and you would rather automate stuff than touch it twice. Relational Database Management Systems aren't a black hole for you? Then apply here!

  • Contentful is looking for a JavaScript BackEnd Engineer to join our team in their mission of getting new users - professional developers - started on our platform within the shortest time possible. We are a fun and diverse family of over 100 people from 35 nations with offices in Berlin and San Francisco, backed by top VCs (Benchmark, Trinity, Balderton, Point Nine), growing at an amazing pace. We are working on a content management developer platform that enables web and mobile developers to manage, integrate, and deliver digital content to any kind of device or service that can connect to an API. See job description.
Fun and Informative Events
  • Analyst Webinar: Forrester Study on Hybrid Memory NoSQL Architecture for Mission-Critical, Real-Time Systems of Engagement. Thursday, March 30, 2017 | 11 AM PT / 2 PM ET. In today’s digital economy, enterprises struggle to cost-effectively deploy customer-facing, edge-based applications with predictable performance, high uptime and reliability. A new, hybrid memory architecture (HMA) has emerged to address this challenge, providing real-time transactional analytics for applications that require speed, scale and a low total cost of ownership (TCO). Forrester recently surveyed IT decision makers to learn about the challenges they face in managing Systems of Engagement (SoE) with traditional database architectures and their adoption of an HMA. Join us as our guest speaker, Forrester Principal Analyst Noel Yuhanna, and Aerospike’s VP Marketing, Cuneyt Buyukbezci, discuss the survey results and implications for your business. Learn and register

  • Your event here!
Cool Products and Services
  • www.site24x7.com : Monitor End User Experience from a global monitoring network. 

  • ButterCMS is an API-first CMS that quickly integrates into your app. Rapidly build CMS-powered experiences in any programming language. Great for blogs, marketing pages, knowledge bases, and more. Butter plays well with Ruby, Rails, Node.js, Go, PHP, Laravel, Python, Flask, Django, and more.

  • Working on a software product? Clubhouse is a project management tool that helps software teams plan, build, and deploy their products with ease. Try it free today or learn why thousands of teams use Clubhouse as a Trello alternative or JIRA alternative.

  • A note for .NET developers: You know the pain of troubleshooting errors with limited time, limited information, and limited tools. Log management, exception tracking, and monitoring solutions can help, but many of them treat the .NET platform as an afterthought. You should learn about Loupe...Loupe is a .NET logging and monitoring solution made for the .NET platform from day one. It helps you find and fix problems fast by tracking performance metrics, capturing errors in your .NET software, identifying which errors are causing the greatest impact, and pinpointing root causes. Learn more and try it free today.

  • Auth0 is the easiest way to add secure authentication to any app/website. With 40+ SDKs for most languages and frameworks (PHP, Java, .NET, Angular, Node, etc), you can integrate social, 2FA, SSO, and passwordless login in minutes. Sign up for a free 22 day trial. No credit card required. Get Started Now.

  • Build, scale and personalize your news feeds and activity streams with getstream.io. Try the API now in this 5 minute interactive tutorial. Stream is free up to 3 million feed updates so it's easy to get started. Client libraries are available for Node, Ruby, Python, PHP, Go, Java and .NET. Stream is currently also hiring Devops and Python/Go developers in Amsterdam. More than 400 companies rely on Stream for their production feed infrastructure, this includes apps with 30 million users. With your help we'd like to ad a few zeros to that number. Check out the job opening on AngelList.

  • Scalyr is a lightning-fast log management and operational data platform.  It's a tool (actually, multiple tools) that your entire team will love.  Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. .  Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • InMemory.Net provides a Dot Net native in memory database for analysing large amounts of data. It runs natively on .Net, and provides a native .Net, COM & ODBC apis for integration. It also has an easy to use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • VividCortex is a SaaS database monitoring product that provides the best way for organizations to improve their database performance, efficiency, and uptime. Currently supporting MySQL, PostgreSQL, Redis, MongoDB, and Amazon Aurora database types, it's a secure, cloud-hosted platform that eliminates businesses' most critical visibility gap. VividCortex uses patented algorithms to analyze and surface relevant insights, so users can proactively fix future performance problems before they impact customers.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

If you are interested in a sponsored post for an event, job, or product, please contact us for more information.

Categories: Architecture

Stuff The Internet Says On Scalability For March 10th, 2017

Fri, 03/10/2017 - 17:56

Hey, it's HighScalability time:

 

Darknet is 4x more resilient than the Internet. An apt metaphor? (URV)
If you like this sort of Stuff then please support me on Patreon.
  • > 5 9s: Spanner availability; 200MB: random access from DNA storage; 215 Pbytes/gram: DNA storage; 287,024: Google commits to open source; 42: hours of audio gold; 33: minutes to get back into programming after interruption; 12K: Chinese startups started per day; 35 million: tons of good shipped under Golden Gate Bridge; 209: mph all-electric Corvette; 500: Disney projects in the cloud; 40%: rise in CO2; 

  • Quoteable Quotes:
    • Marc Rogers: Anything man can make man can break
    • @manupaisable: 10% of machines @spotify rebooted every hour because of defunct #docker - war stories by @i_maravic @qconlondon
    • @robertcottrell: “the energy cost of each bitcoin transaction is enough to power 3.17 US households for a day”
    • Eric Schmidt: We put $30 billion into this platform. I know this because I approved it. Why replicate that?
    • dim: It uses p30 technology. Just basic things, gliders and lightweight spaceships. Basically, the design goes top-down: At the very top, there's the clock. It is a 11520 period clock. Note that you need about 10.000 generations to ensure the display is updated appropriately, but the design should still be stable with a clock of smaller period (about 5.000 or so - the clock needs to be multiple of 60).
    • Luke de Oliveira: Most people in AI forget that the hardest part of building a new AI solution or product is not the AI or algorithms — it’s the data collection and labeling. Standard datasets can be used as validation or a good starting point for building a more tailored solution.
    • @violetblue: Did a lot of people not know that the CIA is a spy agency?
    • @viktorklang: Async is not about *performance*—it is about *scalability*. Let your friends know
    • stillsut: The difference is in the old days, you adapted to computer. Now, computer must adapt to you.
    • Eric Brewer: Spanner uses two-phase commit to achieve serializability, but it uses TrueTime for external consistency, consistent reads without locking, and consistent snapshots.
    • Emily Waltz: Nomura’s molecular robot differs in that it is composed entirely of biological and chemical components, moves like a cell, and is controlled by DNA.
    • Chris Anderson: Most of the devices in our life, from our cars to our homes, are “entropic,” which is to say they get worse over time. Every day they become more outmoded. But phones and drones are “negentropic” devices. Because they are connected, they get better, because the value comes from the software, not hardware
    • William Dutton: Most people using the internet are actually more social than those who are not using the internet
    • @swardley:  ... by 2016, you should have dabbled / learn / tested serverless.  "Go IaaS" or "build our biz as a cloud" in 2017 is #facepalm
    • Bradford Cross: The incompetent segment: the incompetent segment isn’t going to get machine learning to work by using APIs. They are going to buy applications that solve much higher level problems. Machine learning will just be part of how they solve the problems.
    • @denormalize: What do we want? Machine readable metadata! When do we want it? ERROR Line 1: Unexpected token `
    • @Ocramius: "And we should get rid of users: users are not pure, since they modify the state of our system" #confoo
    • Morning Paper: The most important overarching lesson from our study is this: a single file-system fault can induce catastrophic outcomes in most modern distributed storage systems. 
    • Linus Torvalds: And if the DRM "maintenance" is about sending me random half-arsed crap in one big pull, I'm just not willing to deal with it. This is like the crazy ARM tree used to be.
    • Shaun McCormick: Technical Debt is a Positive and Necessary Step in software engineering
    • @tdierks: Hello, my name is Tim. I'm a lead at Google with over 30 years coding experience and I need to look up how to get length of a python string.
    • @codinghorror: I colocated a $600 Ali Express mini pc for $15/month and it is 2x faster than "the cloud"
    • @antirez: "Group chat is like being in an all-day meeting with random participants and no agenda".
    • @sriramhere: Wise man once wrote "As flexible as it is, compute in AWS is optimized for the old capex world." @sallamar
    • @wattersjames: AI will come to your company carefully disguised as a lot of ETL and data-pipeline work...
    • ceejayoz: Lambda's billed in 100 millisecond increments. EC2 servers are billed in one hour increments. If you need short tasks that run in bursty workloads, Lambda's (potentially) a no-brainer.
    • @codinghorror: we have not found bare metal colocation to be difficult, with one exception: persistent file storage. That part, strangely, is quite hard.
    • @jbeda: Lesson from 10 years at Google: this is true until it isn't. Sometimes you *can* build a better mouse trap.
    • jfoutz: I agree. It's genius in a Lex Luthor kind of way. If I understood the full scope of the application, I like to think i'd decline to work on that. It's easy to imagine engineers working on small parts of the system, and never really connecting the dots that the whole point is to evade law enforcement.
    • dsr_:  It's harder (but not impossible) to have complete service lossage like this [Slack] in a federated protocol. That's why you didn't hear about the great email collapse of 2006.
    • throw_away_777: I agree that neural nets are state-of-the-art and do quite well on certain types of problems (NLP and vision, which are important problems). But a lot of data is structured (sales, churn, recommendations, etc), and it is so much easier to train an xgboost model than a neural net model. 
    • @GossiTheDog: #Vault7 CIA - Wiki that Wikileaks released is/was on hosted on DEVLAN, the CIA's "dirty" development network - a major architecture error.
    • Alison Gopnik: new studies suggest that both the young and the old may be especially adapted to receive and transmit wisdom. We may have a wider focus and a greater openness to experience when we are young or old than we do in the hurly-burly of feeding, fighting and reproduction that preoccupies our middle years.
    • @pierre: Wow, audacious to say the least. Intentionally flagging authorities to mislead them. It's like the VW emissions code
    • Joan Gamell: Starting with the obvious: the CIA uses JIRA, Confluence and git. Yes, the very same tools you use every day and love/hate. 
    • Chris Baraniuk: The networks of genes in each animal is a bit like the network of neurons in our brains, which suggests they might be "learning" as they go
    • futurePrimitive: Managers seem to think that programming is typing. No. Programming is *thinking*. The stuff that *looks* like work to a manager (energetic typing) only happens after the hard work is done silently in your head.
    • @danielbryantuk: "There is no such thing as a 'stateless' architecture. It's just someone else's problem" @jboner #qconlondon
    • Platypus: There's no panacea for vendor lock-in. Not even open source, but open source alone gets you further than any number of standards that don't cover what really matters or vendor-provided tools that might go away at any moment. It's the first and best tool for dealing with lock-in, even if it's not perfect. 
    • @tpuddle: @cliff_click talk at #qconlondon about fraud detection in financial trades. Searching 1 billion trades a day "is not that big". !
    • @charleshumble: "Something I see in about 95% of the trading data sets is there are a small number of bad guys hammering it." Cliff Click #qconlondon

  • You may not be able to hear doves cry, but you can listen to machines talk. Elevators to be precise. Watch them chat away as they selflessly shuttle to and fro. Yes, it is as exciting as you might imagine. Though probably not very different than the interior dialogue of your average tool.

  • It used to be that winners wrote history. Now victors destroy data. Terabytes of Government Data Copied

  • Battling legacy code seems to be the number one issue on Stack Overflow, as determined by top books mentioned on Stack Overflow. Not surprising. What was surprising is what's not on the list: algorithm books. Books on the craft of programming took top honors. Gratifying, but at odds with current interviewing dogma. The top 10 books: Working Effectively with Legacy Code; Design Patterns; Clean Code; Java concurrency in practice; Domain-driven Design; JavaScript; Patterns of Enterprise Application Architecture;  Code Complete; Refactoring; Head First Design Patterns.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

Part 4 of Thinking Serverless —  Addressing Security Issues

Mon, 03/06/2017 - 17:56

This is a guest repost by Ken Fromm, a 3x tech co-founder — Vivid Studios, Loomia, and Iron.io. Here's Part 1 and 2 and 3

This post is the last of a four-part series of that will dive into developing applications in a serverless way. These insights are derived from several years working with hundreds of developers while they built and operated serverless applications and functions.

The platform was the serverless platform from Iron.io but these lessons can also apply to AWS LambdaGoogle Cloud FunctionsAzure Functions, and IBM’s OpenWhisk project.

Arriving at a good definition of cloud IT security is difficult especially in the context of highly scalable distributed systems like those found in serverless platforms. The purpose of this post is to not to provide an exhaustive set of principles but instead highlight areas that developers, architects, and security officers might wish to consider when evaluating or setting up serverless platforms.

Serverless Processing — Similar But Different

High-scale task processing is certainly not a new concept in IT as it has parallels that date back to the days of job processing on mainframes. The abstraction layer provided by serverless process — in combination with large-scale cloud infrastructure and advanced container technologies — does, however, bring about capabilities that are markedly different than even just a few years ago.

By plugging into an serverless computing platforms, developers do not need to provision resources based on current or anticipated loads or put great effort into planning for new projects. Working and thinking at the task level means that developers are not paying for resources they are not using. Also, regardless of the number of projects in production or in development, developers using serverless processing do not have to worry about managing resources or provisioning systems.

While serving as Iron.io’s security officer, I answered a number of security questionnaires from customers. One common theme is that they were all in need of a serious update to bring them forward into this new world. Very few had any accommodation for cloud computing much less serverless processing.

Most questionnaires still viewed servers as persistent entities needing constant care and feeding. They presumed physical resources as opposed to virtualization, autoscaling, shared resources, and separation of concerns. Their questions lack differentiation between data centers and development and operation centers. A few still asked for the ability to physically inspect data centers which is, by and large, not really an option these days. And very few addressed APIs, logging, data persistence, or data retention.

The format of the sections below follows the order found in many of these security questionnaires as well as several cloud security policies. The order has been flipped a bit to start with areas where developers can have an impact. Later sections will address platform and system issues which teams will want to be aware of but are largely in the domain of serverless platforms and infrastructure providers.

Security Topics

Data Security
Categories: Architecture

Stuff The Internet Says On Scalability For March 3rd, 2017

Fri, 03/03/2017 - 17:56

Hey, it's HighScalability time:

 

Only 235 trillion miles away. Engage. (NASA)
If you like this sort of Stuff then please support me on Patreon.
  • $5 billion: Netflix spend on new content; $1 billion: Netflix spend on tech; 10%: bounced BBC users for every additional second page load; $3.5 billion: Priceline Group ad spend; 12.6 million: hours streamed by Pornhub per day; 1 billion: hours streamed by YouTube per day; 38,000 BC: auroch carving; 5%: decrease in US TV sets;

  • Quotable Quotes:
    • Fahim ul Haq: Rule 1: Reading High Scalability a night before your interview does not make you an expert in Distributed Systems.
    • @Pinboard: Root cause of outage: S3 is actually hosted on Google Cloud Storage, and today Google Cloud Storage migrated to AWS
    • Matthew Green: ransomware currently is using only a tiny fraction of the capabilities available to it. Secure execution technologies in particular represent a giant footgun just waiting to go off if manufacturers get things only a little bit wrong.
    • dsr_: This [S3 outage] is analogous to "we needed to fsck, and nobody realized how long that would take".
    • tptacek: Uber isn't the driver's employer. Uber is a vendor to the driver. The driver is complaining that its vendor made commitments, on which the driver depended, and then reneged. The driver might be right or might be wrong, but in no discussion with a vendor in the history of the Fortune 500 has it ever been OK for the vendor to accuse their customer of "not taking responsibility for their own shit".
    • @felixsalmon: Hours of video served per day: Facebook: 100 million Netflix: 116 million YouTube: 1 billion
    • @Geek_Manager: "Everybody wants to write reusable code. Nobody wants to reuse anyone else's code." @eryno #leaddev
    • @ellenhuet: a private South Bay high school 1) having a growth fund and 2) being early in Snap is the most silicon valley thing ever
    • @_ginger_kid: I speak from experience as a cash strapped startup CTO. Would love to be multi region, just cannot justify it. V hard.
    • @Objective_Neo: SpaceX, $12 billion valuation: Launches 70m rockets into space and lands them safely. Snapchat, $20 billion valuation: Rainbow Filters.
    • @neil_conway: (2/4): MTTR (repair time) is AT LEAST as important as MTBF in determining service uptime and often easier to improve.
    • John Hagel: we’re likely to see a new category of gig work emerge – let’s call it “creative opportunity targeting.”...we anticipate that more and more of the workforce will be pulled into this arena of creative gig workgroups
    • Seyi Fabode: The constraint is that the broker model, even with new technology, is not value additive. 
    • Robert Kolker: From his experience with the Gary police, Hargrove learned the first big lesson of data: If it’s bad news, not everyone wants to see the numbers
    • gamache: A piece of hard-earned advice: us-east-1 is the worst place to set up AWS services. You're signing up for the oldest hardware and the most frequent outages.
    • Dan Sperber: we each have a great many mental devices that contribute to our cognition. There are many subsystems. Not two, but dozens or hundreds or thousands of little mechanisms that are highly specialized and interact in our brain. Nobody doubts that something like this is the case with visual perception. I want to argue that it’s also the case for the so-called central systems, for reasoning, for inference in general.
    • Joaquin Quiñonero Candela: Facebook today cannot exist without AI. Every time you use Facebook or Instagram or Messenger, you may not realize it, but your experiences are being powered by AI.
    • alicebob: Sometimes keeping things simple is worth more than keeping things globally available.
    • Sveta Smirnova: Both MySQL and PostgreSQL on a machine with a large number of CPU cores hit CPU resources limits before disk speed can start affecting performance.
    • @jamesiry: Using many $100,000’s of compute, Google collided a known weak hash. Meanwhile one botched memcpy leaked the Internet’s passwords.
    • @david4096: teaching engineers to say no is cheaper than Haskell
    • @cgvendrell: #AI will be dictated by Google. They're 1 order of magnitude ahead, they understood key = chip level of stack (TPU) + training data @chamath
    • @antirez: There are tons of more tests to do, but the radix trees could replace most hash tables inside Redis in the future: faster & smaller.
    • DHH: So it remains mostly our fault. Our choice, our dollars. Every purchase a vote for an ever more dysfunctional future. We will spend our way into the abyss.
    • @jamesurquhart: This is why I write about data stream processing and serverless—lessons I learned at @SOASTAInc about the value of real time and BizOps.
    • twakefield: The brilliance of open sourcing Borg (aka Kubernetes) is evident in times like these. We[0] are seeing more and more SaaS companies abstract away their dependencies on AWS or any particular cloud provider with Kubernetes.
    • flak: password hashes aren’t broken by cryptanalysis. They’re rendered irrelevant by time (hardware advancements). What was once expensive is now cheap, what was once slow is now fast. The amount of work hasn’t been reduced, but the difficulty of performing it has.
    • @darkuncle: biz decisions again ... gotta weigh cost/frequency of AWS single-region downtime vs. cost/complexity of multi-region & GSLB.
    • @nantonius: Reducing network latencies are a key enabler for moving away from monolith towards serverless. @adrianco:
    • tbrowbdidnso: These companies that all run their own hardware exclusively are telling everyone that it's stupid to run your own hardware... Why are we listening?
    • jasonhoyt: "People make mistakes all the time...the problem was that our systems that were designed to recognize and correct human error failed us." 
    • @chuhnk: Bob: Service Discovery is a SPOF. You should build async services. Me: How do you receive messages? Bob: A message bus Me: ...
    • @JoeEmison: These articles on serverless remind me of articles on NoSQL from a few years ago. FaaS may have low adoption b/c of the req'd architectures.
    • @Jason: We have 30-60% open rates for http://inside.com  emails vs 1% for app downloads!
    • @adrianco: Split brain syndrome: half your brain thinks message buses are reliable. Other half is wondering how to recover from split brain syndrome.
    • @dbrady: The older I get, the less I care about making tech decisions right and the more I care about retaining the ability to change a wrong one.
    • @littleidea: "Automation code, like unit test code, dies when the maintaining team isn’t obsessive about keeping the code in sync with the codebase."
    • @adulau: I don't ask for bug bounties, fame, cash or even tshirt. I just want a good security point of contact to fix the issues.
    • StorageMojo: most of the SSD vendors don’t make AFAs [all flash arrays]. They have little to lose by pushing NVMe/PCIe SSDs for broad adoption.
    • cookiecaper: I mean, that's not really AWS's problem, is it? Outages happen. If you have a mission-critical service like health care, you really shouldn't write systems with single points of failure like this, especially not systems that depend on something consumer-grade like S3.
    • plgeek: To me his main point is there is a spectrum of what you might consider evidence/proof. However, in Software Engineering their have been low standards set, and it's really not acceptable to continue with low standards. He is not saying the only sort of acceptable evidence is a double blind study.
    • n00b101: I asked an Intel chip designer about this and his opinion was that asynchronous processors are a "fantasy." His reasoning was that an asynchronous chip would still need to synchronize data communication within the chip. Apparently global clock synchronization accounts for about 20% of the power usage of a synchronous chip. In the asynchronous case, if you had to synchronize every communication, then the cost of communication is doubled.

  • Anti-virus software uses fingerprinting as a detection technique. Surprise, nature got there first. Update: CRISPR. Bacteria grab pieces of DNA from viri and store them. This lets them recognize a virus later. When a virus enters a bacteria the bacteria will send out enzymes to combat the invader. Usually the bacteria dies. Sometimes the bacteria wins. The bacteria sends out enzymes to find stray viruses and cut the enemy DNA into small pieces. Those enzymes take the little bits of DNA and splice them into the bacteria's own DNA. DNA is used as a memory device. Next time the virus shows up the bacteria creates molecular assassins that contain a copy of the virus DNA. If there's a pattern match then kill it. The protein looks something like a clam shell. It has a copy of the virus DNA. Whenever it bumps into some virus DNA it pulls apart the DNA, unzips it, reads it, if it's not the right one it moves on. If the RNA has the same sequence then molecular blades come out and chop. Like smart scissors. This is CRISPR.

  • Videos from microXchg 2017 are now available

  • A natural disaster occurred. S3 went down. Were you happy with how your infrastructure responded? @spire was. Mitigating an AWS Instance Failure with the Magic of Kubernetes: "Kubernetes immediately detected what was happening. It created replacement pods on other instances we had running in different availability zones, bringing them back into service as they became available. All of this happened automatically and without any service disruption, there was zero downtime. If we hadn’t been paying attention, we likely wouldn’t have noticed everything Kubernetes did behind the scenes to keep our systems up and running." How do you make this happen?: Distribute nodes across multiple AZs; Nodes should have capacity to handle at least one node failure; Use at least 2 pods per deployment; Use readiness and liveness probes.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

Getting Started with Lyft Envoy for Microservices Resilience

Wed, 03/01/2017 - 17:57

This is a guest repost by Flynn at datawireio on Envoy, a Layer 7 communications bus, used throughout Lyft's service-oriented architecture.

Using microservices to solve real-world problems always involves more than simply writing the code. You need to test your services. You need to figure out how to do continuous deployment. You need to work out clean, elegant, resilient ways for them to talk to each other.

A really interesting tool that can help with the “talk to each other” bit is Lyft’s Envoy: “an open source edge and service proxy, from the developers at Lyft.” (If you’re interested in more details about Envoy, Matt Klein gave a great talk at the 2017 Microservices Practitioner Summit.)

Envoy Overview

It might feel odd to see us call out something that identifies itself as a proxy – after all, there are a ton of proxies out there, and the 800-pound gorillas are NGINX and HAProxy, right? Here’s some of what’s interesting about Envoy:

  • It can proxy any TCP protocol.
  • It can do SSL. Either direction.
  • It makes HTTP/2 a first class citizen, and can translate between HTTP/2 and HTTP/1.1 (either direction).
  • It has good flexibility around discovery and load balancing.
  • It’s meant to increase visibility into your system.
    • In particular, Envoy can generate a lot of traffic statistics and such that can otherwise be hard to get.
    • In some cases (like MongoDB and Amazon RDS) Envoy actually knows how to look into the wire protocol and do transparent monitoring.
  • It’s less of a nightmare to set up than some others.
  • It’s a sidecar process, so it’s completely agnostic to your services’ implementation language(s).

(Envoy is also extensible in some fairly sophisticated — and complex — ways, but we’ll dig into that later — possibly much later. For now we’re going to keep it simple.)

Being able to proxy any TCP protocol, including using SSL, is a pretty big deal. Want to proxy Websockets? Postgres? Raw TCP? Go for it. Also note that Envoy can both accept and originate SSL connections, which can be handy at times: you can let Envoy do client certificate validation, but still have an SSL connection to your service from Envoy.

Of course, HAProxy can do arbitrary TCP and SSL too — but all it can do with HTTP/2 is forward the whole stream to a single backend server that supports it. NGINX can’t do arbitrary protocols (although to be fair, Envoy can’t do e.g. FastCGI, because Envoy isn’t a web server). Neither open-source NGINX nor HAProxy handle service discovery very well (though NGINX Plus has some options here). And neither has quite the same stats support that a properly-configured Envoy does.

Overall, what we’re finding is that Envoy is looking promising for being able to support many of our needs with just a single piece of software, rather than needing to mix and match things.

Envoy Architecture
Categories: Architecture

Sponsored Post: Aerospike, Loupe, Clubhouse, GoCardless, Auth0, InnoGames, Contentful, Stream, Scalyr, VividCortex, MemSQL, InMemory.Net, Zohocorp

Tue, 02/28/2017 - 17:56

Who's Hiring?
  • GoCardless is building the payments network for the internet. We’re looking for DevOps Engineers to help scale our infrastructure so that the thousands of businesses using our service across Europe can take payments. You will be part of a small team that sets the direction of the GoCardless core stack. You will think through all the moving pieces and issues that can arise, and collaborate with every other team to drive engineering efforts in the company. Please apply here.

  • InnoGames is looking for Site Reliability Engineers. Do you not only want to play games, but help building them? Join InnoGames in Hamburg, one of the worldwide leading developers and publishers of online games. You are the kind of person who leaves systems in a better state than they were before. You want to hack on our internal tools based on django/python, as well as improving the stability of our 5000+ Debian VMs. Orchestration with Puppet is your passion and you would rather automate stuff than touch it twice. Relational Database Management Systems aren't a black hole for you? Then apply here!

  • Contentful is looking for a JavaScript BackEnd Engineer to join our team in their mission of getting new users - professional developers - started on our platform within the shortest time possible. We are a fun and diverse family of over 100 people from 35 nations with offices in Berlin and San Francisco, backed by top VCs (Benchmark, Trinity, Balderton, Point Nine), growing at an amazing pace. We are working on a content management developer platform that enables web and mobile developers to manage, integrate, and deliver digital content to any kind of device or service that can connect to an API. See job description.
Fun and Informative Events
  • DBTA Roundtable Webinar: Fast Data: The Key Ingredients to Real-Time Success. Thursday February 23, 2017 | 11:00 AM Pacific Time. Join Stephen Faig, Research Director Unisphere Research and DBTA, as he hosts a roundtable discussion covering new technologies that are coming to the forefront to facilitate real-time analytics, including in-memory platforms, self-service BI tools and all-flash storage arrays. Brian Bulkowski, CTO and Co-Founder of Aerospike, will be speaking along with presenters from Attunity and Hazelcast. Learn more and register.

  • Your event here!
Cool Products and Services
  • Working on a software product? Clubhouse is a project management tool that helps software teams plan, build, and deploy their products with ease. Try it free today or learn why thousands of teams use Clubhouse as a Trello alternative or JIRA alternative.

  • A note for .NET developers: You know the pain of troubleshooting errors with limited time, limited information, and limited tools. Log management, exception tracking, and monitoring solutions can help, but many of them treat the .NET platform as an afterthought. You should learn about Loupe...Loupe is a .NET logging and monitoring solution made for the .NET platform from day one. It helps you find and fix problems fast by tracking performance metrics, capturing errors in your .NET software, identifying which errors are causing the greatest impact, and pinpointing root causes. Learn more and try it free today.

  • Auth0 is the easiest way to add secure authentication to any app/website. With 40+ SDKs for most languages and frameworks (PHP, Java, .NET, Angular, Node, etc), you can integrate social, 2FA, SSO, and passwordless login in minutes. Sign up for a free 22 day trial. No credit card required. Get Started Now.

  • Build, scale and personalize your news feeds and activity streams with getstream.io. Try the API now in this 5 minute interactive tutorial. Stream is free up to 3 million feed updates so it's easy to get started. Client libraries are available for Node, Ruby, Python, PHP, Go, Java and .NET. Stream is currently also hiring Devops and Python/Go developers in Amsterdam. More than 400 companies rely on Stream for their production feed infrastructure, this includes apps with 30 million users. With your help we'd like to ad a few zeros to that number. Check out the job opening on AngelList.

  • Scalyr is a lightning-fast log management and operational data platform.  It's a tool (actually, multiple tools) that your entire team will love.  Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. .  Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • InMemory.Net provides a Dot Net native in memory database for analysing large amounts of data. It runs natively on .Net, and provides a native .Net, COM & ODBC apis for integration. It also has an easy to use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • VividCortex is a SaaS database monitoring product that provides the best way for organizations to improve their database performance, efficiency, and uptime. Currently supporting MySQL, PostgreSQL, Redis, MongoDB, and Amazon Aurora database types, it's a secure, cloud-hosted platform that eliminates businesses' most critical visibility gap. VividCortex uses patented algorithms to analyze and surface relevant insights, so users can proactively fix future performance problems before they impact customers.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network. 

If any of these items interest you there's a full description of each sponsor below...

Categories: Architecture