Software Development Blogs: Programming, Software Testing, Agile, Project Management

High Scalability - Building bigger, faster, more reliable websites

Stuff The Internet Says On Scalability For February 5th, 2016

Fri, 02/05/2016 - 17:56

We have an early entry for the best vacation photo of the century. 

 

If you like this sort of Stuff then please consider offering your support on Patreon.
  • 1 billion: WhatsApp users; 3.5 billion: Facebook users in 2030; $3.5 billion: art sold online; $150 billion: China's budget for making chips; 37.5MB: DNA information in a single sperm; 

  • Quotable Quotes:
    • @jeffiel: "But seriously developers, trust us next time your needs temporarily overlap our strategic interests. And here's a t-shirt."
    • @feross: Modern websites are the epitome of inefficiency. Using giant multi-MB javascript files to do what static HTML could do in 1999.
    • Rob Joyce (NSA): We put the time in …to know [that network] better than the people who designed it and the people who are securing it,' he said. 'You know the technologies you intended to use in that network. We know the technologies that are actually in use in that network. Subtle difference. You'd be surprised about the things that are running on a network vs. the things that you think are supposed to be there.
    • @MikeIsaac: i just realized how awkward Facebook's f8 conference is gonna be this year
    • @Nick_Craver: Stats correction: Stack Overflow did 157,370,800,409 redis ops in the past 30 days, almost always under 2% CPU:
    • @BenedictEvans: The global SMS system does around 20bn messages a day. WhatsApp is now doing 42bn. With 57 engineers.
    • @jaygoldberg: WhatsApp has the benefit of running on top of the world's data networks which employ a few more engineers... 
    • @anildash: It’s odd that developers think Twitter is so hostile while Facebook shuts down stuff like Parse & FBML + cuts back the Instagram & FB APIs.
    • @asynchio:  I use to think CEP = stateful business rules engine + inference + stream processing. Has it changed?
    • @Marco_Rasp: "SOA is about reuse, MicroServices about time to market." @samnewman #microxchg
    • @pfhllnts: "I predict quantum containers where Docker exists both inside and outside a container." @marcoceppi #fosdem
    • @viktorklang: Awesome story: 295x speedup with Akka Streams on same HW compared to Rails :) 
    • krinchan: Yes. Because a currency almost completely controlled by Chinese miners who are strangling the network at 1MB blocks, causing transaction times in excess of three hours at peak and just introduced the ability to arbitrarily reverse those transactions during the lag is totally going to handle DraftKings and FanDuel.
    • @mpesce: 1/The Apple AX series SOCs are more than powerful enough to run a Hololens-type device very effectively.
    • Matthew Yglesias: Amazon's leadership, from CEO Jeff Bezos on down, are deliberately redeploying every dollar of revenue Amazon earns into making the company bigger and bigger.
    • German forest ranger finds that trees have social networks: trees operate less like individuals and more as communal beings. Working together in networks and sharing resources, they increase their resistance to threats
    • @ValaAfshar: 11 years ago some guy named Mark Zuckerberg talks about his new company. He is now 4th richest person in the world. 
    • Bernard Marr: In China, the government is rolling out a social credit score that aggregates not only a citizen’s financial worthiness, but also how patriotic he or she is, what they post on social media, and who they socialize with
    • @Carnage4Life: Facebook is valued at $326 billion and worth more than Exxon Mobil. Remember when people freaked out at $15B value? 
    • @Nick_Craver: High levels of efficiency at scale aren't one thing; it's a thousand things. Many we haven't really shared in detail...and we should.
    • 2BuellerBells: Things to reinvent: Event loops (done!) Unix (In progress!) Erlang (est. 5 years)
    • @LusciousPear: I'm consistently seeing GETs from @googlecloud storage 2-5x faster than S3. niiiice
    • Kevin Old: The future looks mighty scalable.
    • @BenedictEvans: All curation grows until it requires search. All search grows until it requires curation.
    • @Carnage4Life: Google has 7 services with 1B monthly active users; Gmail, Search, Chrome, Android, Maps, YouTube and Google Play 
    • @jmhodges: That's 1.3 million unique domains in a single day. Yesterday. Let's Encrypt is doing a thing.
    • @danielbryantuk: "60% percent of app users rate performance/response time ahead of features" @grabnerandi  #OOP2016 
    • @tdeekens: Sometimes Monoliths don’t get enough respect. They’re part of our revenue system allowing us to build Microservices. They gave us a business
    • Searching for the Algorithms Underlying Life: Valiant’s self-stated goal is to find “mathematical definitions of learning and evolution which can address all ways in which information can get into systems.” If successful, the resulting “theory of everything”...would literally fuse life science and computer science together.
    • @mountain_ghosts: 1995: the information superhighway will mean anyone can do anything from anywhere 2015: must be willing to relocate to San Francisco

  • Fingerprinting made burglars put on gloves. CCTV made kids pull their hoods up. Spying made honest people use encryption. Forensics: What Bugs, Burns, Prints, DNA and More Tell Us About Crime.

  • So that's what bandwidth means. ucaetano: The bandwidth doesn't depend on the frequency you're occupying, but on the amount of spectrum available: you "usually" get in the order of 1 bps for every Hz of spectrum available for mobile: a 20MHz chunk of spectrum will give you ~20Mbps, no matter if it is 700MHz or 5 GHz. Higher frequencies have awful penetration and range, that's why today you define who wins in the mobile game by the amount of 700MHz and 800MHz spectrum they own. In other words, lower frequency spectrum is (within certain limits) always better.

  • Even spies have limits. Optic Nerve: millions of Yahoo webcam images intercepted by GCHQ. A British surveillance agency suffered the indignity of only saving images every five minutes from user feeds to reduce server load. My kingdom for a cloud! Why? They needed data to train their face recognition algorithms. That's what happens if you aren't Google.
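
The rule of thumb ucaetano quotes above (roughly 1 bps per Hz of available spectrum, whatever the carrier frequency) can be checked with a few lines of arithmetic. This is only a sketch of that rule: the function name and the default of 1 bps/Hz are assumptions for illustration, and real spectral efficiency varies with modulation and signal quality.

```python
MHZ = 1_000_000  # hertz per megahertz


def estimated_throughput_bps(spectrum_hz: float, bps_per_hz: float = 1.0) -> float:
    """Rough throughput estimate: spectrum width times spectral efficiency.

    The carrier frequency is deliberately not a parameter, because the
    rule of thumb says throughput depends only on how much spectrum you have.
    """
    if spectrum_hz < 0 or bps_per_hz < 0:
        raise ValueError("spectrum width and efficiency must be non-negative")
    return spectrum_hz * bps_per_hz


# The same 20 MHz chunk gives the same estimate at 700 MHz and at 5 GHz.
for carrier_mhz in (700, 5000):
    mbps = estimated_throughput_bps(20 * MHZ) / MHZ
    print(f"20 MHz of spectrum at a {carrier_mhz} MHz carrier: ~{mbps:.0f} Mbps")
```

What the carrier frequency does change, per the quote, is penetration and range, which this estimate does not model.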

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

A Case Study: WordPress Migration for Shift.ms

Wed, 02/03/2016 - 17:56

This case study covers a migration from a custom database to WordPress. The company that took on the task is Valet, which has a large portfolio of earlier jobs, including database-to-WordPress, multisite-to-multisite, and multisite-to-single-site migrations, among others. The client is Shift.ms.

Problem

The client, Shift.ms, presented a taxing problem to the team. Shift.ms had a custom database that they needed migrated to WordPress. They had installed WordPress with BuddyPress and wanted their data moved into this new installation. All this may seem rather simple. However, there was one problem: the client had some data in the newly installed WordPress that they intended to keep.
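
To see why keeping existing data complicates things, consider what happens to primary keys: legacy rows cannot simply keep their old IDs, because those IDs may already be taken in the new installation. The excerpt does not describe Valet's actual approach, so the following is only a minimal sketch of one common technique, ID remapping. The function names and the in-memory dictionaries standing in for tables are hypothetical.

```python
from itertools import count


def import_users(existing_users, legacy_users):
    """Append legacy users after the existing ones and return old-id -> new-id."""
    next_id = count(max(existing_users, default=0) + 1)
    id_map = {}
    for user in legacy_users:
        new_id = next(next_id)
        existing_users[new_id] = user["name"]
        id_map[user["id"]] = new_id
    return id_map


def import_posts(existing_posts, legacy_posts, user_id_map):
    """Append legacy posts, rewriting author references through the id map."""
    next_id = count(max(existing_posts, default=0) + 1)
    for post in legacy_posts:
        existing_posts[next(next_id)] = {
            "author_id": user_id_map[post["author_id"]],
            "title": post["title"],
        }


# Data already in the new installation that must be preserved.
users = {1: "admin"}
posts = {1: {"author_id": 1, "title": "Welcome"}}

# Rows from the legacy database; note that legacy user id 1 collides with "admin".
legacy_users = [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]
legacy_posts = [{"id": 10, "author_id": 2, "title": "Hello"}]

id_map = import_users(users, legacy_users)
import_posts(posts, legacy_posts, id_map)
print(users)
print(posts)
```

The existing rows are untouched, the legacy rows receive fresh IDs, and every reference between legacy rows is translated through the map so it still points at the right record.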

Challenges

The main problem was that the schema of the custom database and that of WordPress are very different in structure. The following issues arose in dealing with the problem:

Categories: Architecture

Sponsored Post: Netflix, Macmillan Learning, Aerospike, TrueSight Pulse, LaunchDarkly, Robinhood, StatusPage.io, Redis Labs, InMemory.Net, VividCortex, MemSQL, Scalyr, AiScaler, AppDynamics, ManageEngine, Site24x7

Tue, 02/02/2016 - 18:45

Who's Hiring?
  • Macmillan Learning, a premier e-learning institute, is looking for VP of DevOps to manage the DevOps teams based in New York and Austin. This is a very exciting team as the company is committed to fully transitioning to the Cloud, using a DevOps approach, with focus on CI/CD, and using technologies like Chef/Puppet/Docker, etc. Please apply here.

  • DevOps Engineer at Robinhood. We are looking for an Operations Engineer to take responsibility for our development and production environments deployed across multiple AWS regions. Top candidates will have several years experience as a Systems Administrator, Ops Engineer, or SRE at a massive scale. Please apply here.

  • Senior Service Reliability Engineer (SRE): Drive improvements to help reduce both time-to-detect and time-to-resolve while concurrently improving availability through service team engagement.  Ability to analyze and triage production issues on a web-scale system a plus. Find details on the position here: https://jobs.netflix.com/jobs/434

  • Manager - Performance Engineering: Lead the world-class performance team in charge of both optimizing the Netflix cloud stack and developing the performance observability capabilities which 3rd party vendors fail to provide.  Expert on both systems and web-scale application stack performance optimization. Find details on the position here https://jobs.netflix.com/jobs/860482

  • Senior Devops Engineer - StatusPage.io is looking for a senior devops engineer to help us in making the internet more transparent around downtime. Your mission: help us create a fast, scalable infrastructure that can be deployed to quickly and reliably.

  • Software Engineer (DevOps). You are one of those rare engineers who loves to tinker with distributed systems at high scale. You know how to build these from scratch, and how to take a system that has reached a scalability limit and break through that barrier to new heights. You are a hands on doer, a code doctor, who loves to get something done the right way. You love designing clean APIs, data models, code structures and system architectures, but retain the humility to learn from others who see things differently. Apply to AppDynamics

  • Software Engineer (C++). You will be responsible for building everything from proof-of-concepts and usability prototypes to deployment-quality code. You should have at least 1+ years of experience developing C++ libraries and APIs, and be comfortable with daily code submissions, delivering projects in short time frames, multi-tasking, handling interrupts, and collaborating with team members. Apply to AppDynamics
Fun and Informative Events

  • Your event could be here. How cool is that?
Cool Products and Services
  • Aerospike Shows Fivefold Cost Advantage over Cassandra at Higher Performance in DataStax’s Own Benchmark. A recent NoSQL database performance test by DataStax concluded that Cassandra bested Couchbase, MongoDB and HBase. Since Aerospike wasn’t included in the evaluation, we ran the benchmark against Aerospike in the same test cases. The result? Aerospike dramatically outperformed Cassandra AND cost 5 times less. Read the details here

  • Dev teams are using LaunchDarkly’s Feature Flags as a Service to get unprecedented control over feature launches. LaunchDarkly allows you to cleanly separate code deployment from rollout. We make it super easy to enable functionality for whoever you want, whenever you want. See how it works.

  • TrueSight Pulse is SaaS IT performance monitoring with one-second resolution, visualization and alerting. Monitor on-prem, cloud, VMs and containers with custom dashboards and alert on any metric. Start your free trial with no code or credit card.

  • Turn chaotic logs and metrics into actionable data. Scalyr is a tool your entire team will love. Get visibility into your production issues without juggling multiple tools and tabs. Loved and used by teams at Codecademy, ReturnPath, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • InMemory.Net provides a Dot Net native in memory database for analysing large amounts of data. It runs natively on .Net, and provides a native .Net, COM & ODBC apis for integration. It also has an easy to use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • VividCortex measures your database servers’ work (queries), not just global counters. If you’re not monitoring query performance at a deep level, you’re missing opportunities to boost availability, turbocharge performance, ship better code faster, and ultimately delight more customers. VividCortex is a next-generation SaaS platform that helps you find and eliminate database performance problems at scale.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

  • aiScaler, aiProtect, aiMobile Application Delivery Controller with integrated Dynamic Site Acceleration, Denial of Service Protection and Mobile Content Management. Also available on Amazon Web Services. Free instant trial, 2 hours of FREE deployment support, no sign-up required. http://aiscaler.com

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network.

If any of these items interest you there's a full description of each sponsor below...

Categories: Architecture

The Big List of Alternatives to Parse

Tue, 02/02/2016 - 17:56

Parse is not going away. It’s going to get better.
Ilya Sukhar — April 25th, 2013 on the Future of Parse

 

Parse is dead. The great diaspora has begun. The gold rush is on. There’s a huge opportunity for some to feed and grow on Parse’s 600,000 fleeing customers.

Where should you go? What should you do? By now you’ve transitioned through all five stages of grief and are ready for stage six: doing something about it. Fortunately there are a lot of options and I’ve gathered as many resources as I can here in one place.

There is a Lot of Pain Out There

Parse closing is a bigger deal than most shutterings. There’s even a petition: Don't Shut down Parse.com. That doesn’t happen unless you’ve managed to touch people. What could account for such an outpouring of emotion?

Parse and the massive switch to mobile computing grew up at the same time. Mobile is by definition personal. Many programmers capable of handling UI programming challenges were not as experienced with backend programming, and Parse filled that void. When a childhood friend you grew to depend on dies, it hurts. That hurt is deep. It goes into the very nature of how you make stuff, how you grow, how you realize your dreams, how you make a living. That’s a very intimate connection.

For a trip down memory lane, Our Incredible Journey is a Tumblr chronicling many services that are no longer with us.

Some reactions from around the net:

maxado_zdl: F*ck you facebook!!!!!!!!!!!!!!!!!!!!!!!!

pacp_ec: Damn it Facebook only George R. R. Martin is allowed to kill my heroes

Mythul: I really hate facebook right now ! Thanks for screwing up my apps with your bad business model!

Mufro: Damn. We've been slowly migrating our smaller apps to Parse as we make annual updates. Now we're trying to figure out what we're gonna do... go back to the pain of rolling our own server backends out? This leaves a pretty big hole in the market IMO. I don't know of anyone who gets you off the ground as quickly and affordably as Parse does. It's been a joy to use their product, but I knew deep down it was too good to be true. I guess we'll have to take a look at AWS again, maybe Azure. We use Firebase in another project, so we might check that out too. This sucks though.

samwize7: When Facebook acquired Parse, I thought it is good news since they ain't profitable, and now they have a backing of a giant, who tried hard to woo developers. I built many mobile apps using Parse, and has always been a fan of how they build a product for developers. Their documentation is awesome, their free tier is generous, their SDK covers widely. Today, their announcement is a sad news. And once again, proves that we can't trust Facebook.

clev1: This literally just ruined my day....I've got 2 major projects near completion that I've been using Parse as a BaaS for. Anyone with experience know how difficult or a transition it is to switch to Firebase?

solumamantis: I just can't believe the service is being retired... I started using three months ago - my new app coming out soon is completely reliant on it..... I will have a look on Firebase, but honestly I think i will build my own Parse/Node.js version and manage it myself....

changingminds: What the f*ck. Wtf am I supposed to do with 120k users who currently use my app that uses parse? I gotta redo the entire f*cking backend? F*cking bullsh*t.

manooka: My entire startup relies on Parse. I developed the website and apps myself as this was perfect for me as a Front-end developer without having to worry about back-end servers/databases etc. This is SERIOUSLY bad news.

stuntmanmikey: I'm a full-stack developer who is part of a startup that depends on Parse. As the only developer, the amount of time we've saved NOT having to write a data access layer and web service layer has been a windfall for us. Now I'm left to either switch to a similar product (Firebase just doesn't have the same appeal to me) or implement the backend myself at great cost.

neckbeardfedoras: The thing is, most of the folks using Parse probably use it because they're not full stack or back end developers. Removal of Parse means more time or money spent on resources to manage a back end system.

Why Did Facebook Shut Down Parse?
Categories: Architecture

A Patreon Architecture Short

Mon, 02/01/2016 - 17:56

Patreon recently snagged $30 Million in funding. It seems the model of pledging $1 for individual feature releases or code changes won't support fast enough growth. CEO Jack Conte says: We need to bring in so many people so fast. We need to keep up with hiring and keep up with making all of the things.

Since HighScalability is giving Patreon a try I've naturally wondered how it's built. Modulo some serious security issues Patreon has always worked well. So I was interested to dig up this nugget in a thread on the funding round where the Director of Engineering at Patreon shares a little about how Patreon works:

  • Server is in Python using Flask and SQLAlchemy, 
  • Runs on AWS (EC2, RDS (MySQL), and some Redis, Celery, SQS, etc. to boot). 
  • A few microservices here and there in other languages too (e.g. real time chat server with Node & Firebase)
  • Web code is written in React (with some legacy code in Angular). We tend to use Redux for the non-component pieces, but are still trying out new React-compatible libraries here and there.
  • iOS and Android code are written in Objective-C and Java, respectively. 
  • We use Realm on both platforms for data storage
  • Most of the rest is pretty standard modern project stuff (CocoaPods for iOS, Gradle on Android, etc.)

For this time period it seems like a good set of technologies to use for the type of application Patreon is. It's interesting to see Angular referred to as legacy code. React seems to be winning the framework wars.

The use of Realm is notable on the mobile platform as a common storage layer. Realm's simplicity is attractive.

The use of microservices may have helped Patreon dodge the Parse closing down bullet. Instead of trying to find one backend to rule them all they picked Firebase, a more targeted technology, to implement a specific feature. Service diversification is a great way to manage service failure risk.

Categories: Architecture

Stuff The Internet Says On Scalability For January 29th, 2016

Fri, 01/29/2016 - 17:56

Hey, it's HighScalability time:


This is a trace of a Google search query. A single query might touch a couple thousand machines.

 

If you like this Stuff then please consider supporting me on Patreon.
  • 88: the too short life of Marvin Minsky; $18.4 billion: profit made by Apple in 3 months; 100M: hours of video watched on Facebook each day; 1.59 billion: Facebook users; $115B: size of game market by 2020; 12 years: Mars rover still going strong; 96.3m: barrels of oil produced per day; 570 Billion: object brighter than the Sun; 134 pounds: carried by drones;  $2.4 billion: AWS Q4 sales; 2.5 million: advertisers on the Facebook;

  • Quotable Quotes:
    • @ptaoussanis: Real-world scaling 101: be in the habit of routinely, objectively asking what parts of your system could stand to be simplified or removed
    • @Carnage4Life: Azure revenue up 140%. Search revenue from #BingAds up 21%. Microsoft is killing it in the cloud
    • @gabriel_boya: Scaling up a Cloud Service on @azure takes so many hours that your customers may be gone by the time your instances are allocated...
    • AJ007: Facebook is the only platform that lets advertisers target a mass audience with very fine demographic precision. Google you lose the demographics. Television, you lose the the precision.
    • Junaid Anwar: It is to be noted that clustering [node.js] yielded two times the performance as compared to the non-clustering case which shows that performance linearly increases with processing cores when clustering is used.
    • crash41301: Our company has been slowly shrinking the hundreds of services we have down to a handful of larger, automated tested services and the dev team (about 50) likes it much more.
    • @swardley: Compute is the activity, Architecture is the practice
    • van lessen: Self-Contained Systems (SCS) describe an architectural approach to build software systems, e.g. to split monoliths into multiple functionally separated, yet largely autonomous web applications. 
    • R. P. Feynman: What is the cause of management's fantastic faith in the machinery?
    • Steven Max Patterson: Facebook filters much from the raw newstream and gives me what it thinks I want with about 20% accuracy.
    • Brandon Butterworth~ a single mega data centre might simply represent a single, large potential point of failure
    • boggzPit: Damn it Facebook. Why did I ever believe you could handle being cool to developers?
    • Vadim Tkachenko: To recap an interesting point in that post: when using 48 cores with the server, the result was worse than with 12 cores. I wanted to understand the reason is was true, so I started digging. My primary suspicion was that Java (I never trust Java) was not good dealing with 100GB of memory.
    • Seth Lloyd: Our algorithm shows that you don't need a big quantum computer to kick some serious topological butt...You could find the topology of simple structures on a very simple quantum computer. 
    • Robert Scoble: When he was doing his thesis 20 years ago, it took him two years to analyze just 24 hours of data from farms (he pulls in data from satellites, Doppler radar and even drones). Today, his company does the same thing in seconds.
    • @jgrahamc: Devotees of microservices use 'monolith' as a derogatory term; wait 10 years and we'll be using 'spider's web' as a derogatory term.
    • @mweagle: I see your femtoservice, and pivot with a single source code point: “yoctoservice” :) #disrupt #unicorn #M&A
    • milesrout: The entire point of Docker is that you use it for everything. It's a universal application image format. That is the point. It's contained, secure, and childproof. That is the point. It's not just about scalability. If I could use a desktop operating system where all programs ran as docker containers, I'd do that too. That's what they're for.
    • Bill Wash: I will never pass up an opportunity to help out a colleague, and I’ll remember the days before I knew everything.
    • @CarlHasselskog:  my startup handles ~10 million uploaded files/day with two employees in total (entire company). That's largely thanks to you guys.
    • AJ Kohn: December saw more negative numbers with a 6.96% decrease, year over year, in desktop search volume. Every month in 2015 had lower desktop query volume than the same month in 2014. Every. Month.
    • Jerry Chen: Every startup has a different size unit of value. Bigger is not better, smaller is not better.
    • sacundim: No, the goal of normalization is to eliminate logical inconsistencies—data sets that entail two or more different answers to the same question. 
    • Jake Archibald: Streams can be used to do fun things like turn clouds to butts, transcode MPEG to GIF, but most importantly, they can be combined with service workers to become the fastest way to serve content.
    • Solomon Hykes: Computers do run only one unikernel at a time. It’s just that sometimes they are virtual computers. Remember that virtualization is increasingly hardware-assisted, and the software parts are mature. So for many use cases it’s reasonable to separate concerns and just assume that VMs are just a special type of computer.

  • Relying on a tool backed by a big company is no protection. Facebook is closing down Parse. This is a stunner because Parse was a popular and well made service, used by millions of now adrift mobile apps. What happened? This might be it: "Facebook also would have had to invest untold millions of dollars in capital and, more importantly, engineering talent, to get the Parse business fully off the ground to have a better chance at making a dent in competitors like Amazon, Microsoft and Google." How about Firebase? The Firebase founder responds: "We're not going anywhere. What makes us different? Firebase is very complementary to Google's other product offerings. Cloud for one, as well as Angular, Polymer, GCM, etc." The moral of the story is told by bsaul: "parse wasn't a core service for facebook, nor a relevant source of a revenue AND their API wasn't standard. Those points combined made it very risky for people to use it." 

  • The Internet will soon be eating a lot of Brotli, Google's new lossless compression algorithm that is making the Internet 17-25% faster. Support will be in Chrome and other browsers, but server side support may take longer. Why does it only work with https? Richard Coles: one reason why this is limited to https is to stop it being mangled by proxies, which has been a practical problem in the past with encodings.

  • Young Skynet is continuing its dastardly plan of self-creation by seeding deep learning both far and wide. Microsoft Open Sources Deep Learning, AI Toolkit On GitHub. Twitter released Distributed learning in Torch. Teach Yourself Deep Learning with TensorFlow and Udacity.

  • While the Super Bowl will make a mess of local traffic, it's great for cell phone service. Verizon spent $70 million to triple Bay Area LTE capacity ahead of the Super Bowl. They have more than tripled their 4G LTE network capacity, built 16 new area cell sites, installed 75 small cells, boosted capacity by adding 37 XLTE to existing sites, and completed preparations to deploy 14 mobile cell sites in high traffic locations.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

Tinder: How does one of the largest recommendation engines decide who you'll see next?

Wed, 01/27/2016 - 17:56

We've heard a lot about the Netflix recommendation algorithm for movies, how Amazon matches you with stuff, and Google's infamous PageRank for search. How about Tinder? It turns out Tinder has a surprisingly thoughtful recommendation system for matching people.

This is from an extensive profile, Mr. (Swipe) Right?, on Tinder founder Sean Rad:

Categories: Architecture

Design of a Modern Cache

Mon, 01/25/2016 - 17:56

This is a guest post by Benjamin Manes, who did engineery things for Google and is now doing engineery things for a new load documentation startup, LoadDocs.

Caching is a common approach for improving performance, yet most implementations use strictly classical techniques. In this article we will explore the modern methods used by Caffeine, an open-source Java caching library, that yield high hit rates and excellent concurrency. These ideas can be translated to your favorite language and hopefully some readers will be inspired to do just that.

Eviction Policy

A cache’s eviction policy tries to predict which entries are most likely to be used again in the near future, thereby maximizing the hit ratio. The Least Recently Used (LRU) policy is perhaps the most popular due to its simplicity, good runtime performance, and a decent hit rate in common workloads. Its ability to predict the future is limited to the history of the entries residing in the cache, preferring to give the last access the highest priority by guessing that it is the most likely to be reused again soon...
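
As a reference point for the classical technique described above, here is a minimal LRU cache sketch. It illustrates plain LRU only, not Caffeine's own eviction policy, and the class and method names are assumptions chosen for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Bounded cache that evicts the least recently used entry when full."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        # Ordered from least recently used (front) to most recently used (back).
        self._entries = OrderedDict()

    def get(self, key, default=None):
        if key not in self._entries:
            return default
        # A hit makes this entry the most recently used one.
        self._entries.move_to_end(key)
        return self._entries[key]

    def put(self, key, value):
        if key in self._entries:
            self._entries.move_to_end(key)
        self._entries[key] = value
        if len(self._entries) > self._capacity:
            # Drop the entry at the front: the least recently used.
            self._entries.popitem(last=False)

    def __contains__(self, key):
        return key in self._entries

    def __len__(self):
        return len(self._entries)


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")      # "a" is now more recent than "b"
cache.put("c", 3)   # the cache is full, so "b" is evicted
print("b" in cache, "a" in cache, "c" in cache)
```

Note that the only history this policy keeps is the recency order of entries still in the cache, which is the limitation the paragraph above points out.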

Categories: Architecture

Stuff The Internet Says On Scalability For January 22nd, 2016

Fri, 01/22/2016 - 17:55

Hey, it's HighScalability time:


The Imaginary Kingdom of Aurullia. A completely computer generated fractal. Stunning and unnerving.

 

If you like this Stuff then please consider supporting me on Patreon.
  • 42,000: drones from China securing the South China Sea; 1 billion: WhatsApp active users; 2⁻¹²²: odds of two GUIDs with 122 random bits colliding; 25,000 to 70,000: memory chip errors per billion hours per megabit; 81,500: calories in a human body; 62: people as wealthy as half of world's population; 1.66 million: App Economy jobs in the US; 521 years: half-life of DNA; 0.000012%: air passenger fatalities; $1B: Microsoft free cloud resources for nonprofits; 4000-7000+: BBC stats collected per second; $1 billion: Google's cost to taste Apple's pie;

  • Quotable Quotes:
    • @mcclure111: 1995: Every object in your home has a clock & it is blinking 12:00 / 2025: Every object in your home has a IP address & the password is Admin
    • @notch: Coming soon to npm: tirefire.js, an asynchronous framework for implementing helper classes for reinventing the wheel. Based on promises.
    • @ayetempleton: Fun fact: You are MORE likely to win a million or more dollars in the #powerball lottery than to lose an #AWS #S3 object in a given year.
    • @viktorklang: IMO biggest lie in performance work: constant factors don't matter in Big-Oh.
    • Flavien Boucher: We all came to the conclusion that Docker is adding a complexity layer compare to a virtual machine approach, and this complexity will be for the deployment, development and build.
    • @Frances_Coppola: Uber is a cab cartel. And AirBNB is wealthy - though its suppliers aren't. They are simply firms with apps.
    • Susan Sontag: The method especially appeals to people handicapped by a ruthless work ethic – Germans, Japanese and Americans. Using a camera appeases the anxiety which the work driven feel about not working when they are on vacation and supposed to be having fun. They have something to do that is like a friendly imitation of work: they can take pictures.
    • @SachaNauta: "It's never been easier to be a billionaire and never been harder to be a millionaire" @profgalloway #DLD16
    • @Techmeme: Google Play saw 100% more downloads than iOS App Store, but Apple generated 75% more revenue 
    • Ryan Shea: we’ve concluded that 8MB blocks are simply too large to be considered safe for the network at this point in time, considering the current global bandwidth levels.
    • @RichRogersHDS: "In the old world you spent 30% of your time building a great service & 70% shouting about it. In the new world, that inverts." - Jeff Bezos
    • @thetinot: When you have an SDN, yes, networking throughput does grow on trees. Why @googlecloud is faster than #AWS and #Azure 
    • @GOettingerEU: Digital tech has contributed to around 1/3 of EU GDP growth in over the past decade and I believe this number will continue to grow #wef16
    • @COLRICHARDKEMP: More women fly F16s in Israel than drive cars in Saudi Arabia. KA. 
    • @JoshZumbrun: The total collapse in shopping mall construction
    • @jeffjarvis: 44 million people saw NY Fashion Show content on Instagram last year says Instagram's Marne Levine. Attn: Conde & Hearst!  #DLD16
    • @HackerNewsOnion: Developer Accused Of Unreadable Code Refuses To Comment
    • Lloyds online banking: in a 60-second period: 12,900 people visit its website, 400 bills are paid, 1,500 customers log onto the mobile app, 350 transfers are made and 3,000+ logins
    • @bdha: 2013: DevOps 2014: Docker 2015: Containers 2016: Unikernels 2017: Threads 2018: Syscalls 2019: Inodes
    • hacknat: Two things need to happen to make unikernels attractive. A new Hypervisor needs to get made, one that is just as extensible as an OS around the isolated primitives. It should also have something extra too (like the ability to fine tune resource management better than an OS can). Secondly a user friendly mechanism like Docker needs to happen.

  • It's a winner take all world, but not everywhere. Brian Brushwood on Cordkillers offers an insightful breakdown of how the new diversified market for TV content has actually become far less of a winner take all system. We have more good content than ever. Gone are the days of M*A*S*H, when everyone watched the same show at the same time. Is it bad that actors are making less? No. We are seeing the destruction of the tournament, which, as explained in the book Freakonomics, is the idea that those at the very top make all the money while those at the bottom of the pyramid make next to nothing. And the winners only have to win by a nose to reap all the rewards; they don't even need to win on merit. This is an inefficient system. Now we are reaching an artistically efficient system. If you have a story to tell and no budget you can tell it on YouTube. This is the democratization of talent. It's inconvenient for those who used to be at the top. What we have now is more working actors producing more content than ever. And since a lot of this content does not have to pander to advertisers to get made, the content is more diverse and more interesting than ever as well.

  • The RAMCloud Storage System: RAMCloud combines low-latency, large scale, and durability. Using state of the art networking with kernel bypass, RAMCloud expects small reads to complete in less than 10µs on a cluster of 10,000 nodes. This is 50 – 1,000 times faster than storage systems commonly in use.

  • All Change Please. Adrian Colyer makes the case that we are transitioning to a new part of the technology cycle that promises great change. Networking: 40Gbps and 100Gbps ethernet. Memory: battery backed RAM; 3D XPoint, MRAM, MeRAM, etc. Storage: NVRAM and fast PCIe. Processing: GPUs; integrated on processor FPGAs; hardware transactional memory. This is the question: What happens when you combine fast RDMA networks with ample persistent memory, hardware transactions, enhanced cache management support and super-fast storage arrays? It’s a whole new set of design trade-offs that will impact the OS, file systems, data stores, stream processing, graph processing, deep learning and more. And this is before we’ve even introduced integration with on-board FPGAs, and advances in GPUs…

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

Why does Unikernel Systems Joining Docker Make A Lot of Sense?

Thu, 01/21/2016 - 21:40

Unikernel Systems Joins Docker. Now this is an interesting match. The themes are security and low overhead, though they do seem to solve the same sort of problem.

So, what's going on?

In FLOSS WEEKLY 302 Open Mirage, starting at about 10 minutes in, there are a series of possible clues. Dr. Anil Madhavapeddy, former CTO of Unikernel Systems, explains their motivation behind the creation of unikernels. And it's a huge and exciting vision...

Categories: Architecture

Building An Infinitely Scaleable Online Recording Campaign For David Guetta

Wed, 01/20/2016 - 17:56

This is a guest repost of an interview posted by Ryan S. Brown that originally appeared on serverlesscode.com. It continues our exploration of building systems on top of Lambda.

Paging David Guetta fans: this week we have an interview with the team that built the site behind his latest ad campaign. On the site, fans can record themselves singing along to his single, “This One’s For You” and build an album cover to go with it.

Under the hood, the site is built on Lambda, API Gateway, and CloudFront. Social campaigns tend to be pretty spiky – when there’s a lot of press a stampede of users can bring infrastructure to a crawl if you’re not ready for it. The team at parall.ax chose Lambda because there are no long-lived servers, and they could offload all the work of scaling their app up and down with demand to Amazon.
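For readers who have not seen the "no long-lived servers" model, here is a rough sketch of what a single function behind API Gateway can look like. It assumes the Lambda proxy integration (the request arrives as an `event` dict and the function returns a dict with `statusCode` and `body`). The handler name, the `name` field, and the response message are hypothetical, and Python is used only for illustration since the excerpt does not say which language the team used.

```python
import json


def handler(event, context):
    """Hypothetical Lambda entry point that accepts a small JSON payload."""
    try:
        payload = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return {"statusCode": 400, "body": json.dumps({"error": "invalid JSON"})}

    if not isinstance(payload, dict) or not payload.get("name"):
        return {"statusCode": 400, "body": json.dumps({"error": "name is required"})}

    # A real handler would store the submission somewhere durable here.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": "Thanks, " + payload["name"] + "!"}),
    }


# Local smoke test; no AWS account is needed to call the function directly.
response = handler({"body": json.dumps({"name": "Ada"})}, None)
```

Because the function keeps no state between calls, the platform can run as many copies as the traffic spike requires, which is the scaling work the team handed off to Amazon.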

James Hall from parall.ax is going to tell us how, in just six weeks and starting from nothing, they built an internationalized app that can handle any level of demand.

The Interview
Categories: Architecture

Sponsored Post: Netflix, Macmillan, Aerospike, TrueSight Pulse, LaunchDarkly, Robinhood, StatusPage.io, Redis Labs, InMemory.Net, VividCortex, MemSQL, Scalyr, AiScaler, AppDynamics, ManageEngine, Site24x7

Tue, 01/19/2016 - 18:01

Who's Hiring?
  • Manager - Site Reliability Engineering: Lead and grow the front door SRE team in charge of keeping Netflix up and running. You are an expert of operational best practices and can work with stakeholders to positively move the needle on availability. Find details on the position here: https://jobs.netflix.com/jobs/398

  • Macmillan Learning, a premier e-learning institute, is looking for VP of DevOps to manage the DevOps teams based in New York and Austin. This is a very exciting team as the company is committed to fully transitioning to the Cloud, using a DevOps approach, with focus on CI/CD, and using technologies like Chef/Puppet/Docker, etc. Please apply here.

  • DevOps Engineer at Robinhood. We are looking for an Operations Engineer to take responsibility for our development and production environments deployed across multiple AWS regions. Top candidates will have several years experience as a Systems Administrator, Ops Engineer, or SRE at a massive scale. Please apply here.

  • Senior Service Reliability Engineer (SRE): Drive improvements to help reduce both time-to-detect and time-to-resolve while concurrently improving availability through service team engagement.  Ability to analyze and triage production issues on a web-scale system a plus. Find details on the position here: https://jobs.netflix.com/jobs/434

  • Manager - Performance Engineering: Lead the world-class performance team in charge of both optimizing the Netflix cloud stack and developing the performance observability capabilities which 3rd party vendors fail to provide.  Expert on both systems and web-scale application stack performance optimization. Find details on the position here https://jobs.netflix.com/jobs/860482

  • Senior Devops Engineer - StatusPage.io is looking for a senior devops engineer to help us in making the internet more transparent around downtime. Your mission: help us create a fast, scalable infrastructure that can be deployed to quickly and reliably.

  • Software Engineer (DevOps). You are one of those rare engineers who loves to tinker with distributed systems at high scale. You know how to build these from scratch, and how to take a system that has reached a scalability limit and break through that barrier to new heights. You are a hands on doer, a code doctor, who loves to get something done the right way. You love designing clean APIs, data models, code structures and system architectures, but retain the humility to learn from others who see things differently. Apply to AppDynamics

  • Software Engineer (C++). You will be responsible for building everything from proof-of-concepts and usability prototypes to deployment-quality code. You should have at least one year of experience developing C++ libraries and APIs, and be comfortable with daily code submissions, delivering projects in short time frames, multi-tasking, handling interrupts, and collaborating with team members. Apply to AppDynamics
Fun and Informative Events
  • Aerospike, the high-performance NoSQL database, hosts a 1-hour live webinar on January 28 at 1PM PST / 4 PM EST on the topic of "From Development to Deployment" with Docker and Aerospike. This session will cover what Docker is and why it's important to Developers, Admins and DevOps when using Aerospike; it features an interactive demo showcasing the core Docker components and explaining how Aerospike makes developing & deploying multi-container applications simpler. Please click here to register.

  • Your event could be here. How cool is that?
Cool Products and Services
  • Dev teams are using LaunchDarkly’s Feature Flags as a Service to get unprecedented control over feature launches. LaunchDarkly allows you to cleanly separate code deployment from rollout. We make it super easy to enable functionality for whoever you want, whenever you want. See how it works.

  • TrueSight Pulse is SaaS IT performance monitoring with one-second resolution, visualization and alerting. Monitor on-prem, cloud, VMs and containers with custom dashboards and alert on any metric. Start your free trial with no code or credit card.

  • Turn chaotic logs and metrics into actionable data. Scalyr is a tool your entire team will love. Get visibility into your production issues without juggling multiple tools and tabs. Loved and used by teams at Codecademy, ReturnPath, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • InMemory.Net provides a Dot Net native in memory database for analysing large amounts of data. It runs natively on .Net, and provides a native .Net, COM & ODBC apis for integration. It also has an easy to use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • VividCortex measures your database servers’ work (queries), not just global counters. If you’re not monitoring query performance at a deep level, you’re missing opportunities to boost availability, turbocharge performance, ship better code faster, and ultimately delight more customers. VividCortex is a next-generation SaaS platform that helps you find and eliminate database performance problems at scale.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

  • aiScaler, aiProtect, aiMobile Application Delivery Controller with integrated Dynamic Site Acceleration, Denial of Service Protection and Mobile Content Management. Also available on Amazon Web Services. Free instant trial, 2 hours of FREE deployment support, no sign-up required. http://aiscaler.com

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network.

If any of these items interest you there's a full description of each sponsor below...

Categories: Architecture

Use Google For Throughput, Amazon And Azure For Low Latency

Mon, 01/18/2016 - 17:56

Which cloud should you use? It may depend on what you need to do with it. What Zach Bjornson needs to do is process large amounts of scientific data as fast as possible, which means reading data into memory as fast as possible. So he made a benchmark, using Google's new multi-cloud PerfKitBenchmarker, to figure out which cloud was best for the job.

The results are in a very detailed article: AWS S3 vs Google Cloud vs Azure: Cloud Storage Performance. Feel free to datamine the results for more insights, but overall his conclusions are:

Categories: Architecture

Stuff The Internet Says On Scalability For January 15th, 2016

Fri, 01/15/2016 - 17:56

Hey, it's HighScalability time:


Space walk from 2001: A Space Odyssey? Nope. A base jump from the CN Tower in Toronto.

 

  • 13.5TB: open data from Yahoo for machine learning; 1+ exabytes: data stored in the cloud; 13: reasons autonomous cars should have steering wheels; 3,000: kilowatt-hours of energy generated by the solar bike path; 10TB: helium-filled hard disk; $224 Billion: 2016 gadget spending in US; 85: free ebooks; 17%: Azure price drop on some VMs; 20.5: tons of explosives detonated on Mythbusters; 20 Billion: Apple’s App Store Sales; 70%: Global Internet traffic goes through Northern Virginia; 12: photos showing the beauty of symmetry; 

  • Quotable Quotes:
    • @WhatTheFFacts: Scaling Earth's 'life' to 46 years, the industrial revolution began 1 minute ago -- In that time we've destroyed half the world's forests.
    • David Brin: The apotheosis of Darth Vader was truly disgusting. Saving one demigod—a good demigod, his son—wiped away all his guilt from slaughtering billions of normal people.
    • Brian Brazil: In today’s world, having a 1:1 coupling between machines and services is becoming less common. We no longer have the webserver machine, we have one machine which hosts one part of the webserver service. 
    • @iamxavier: "Snapchat is said to have 7 billion mobile video views vs Facebook’s 8 bil.The kicker: Fb has 15x Snapchat’s users."
    • Charlie Stross: Do you want to know the real reason George R. R. Martin's next book is late? it's because keeping track of that much complexity and so many characters and situations is hard work, and he's not getting any younger. 
    • @raju: Unicorn-Size Losses: @Uber lost $671.4 million in 2014 & $987.2 million in the first half of 2015
    • @ValaAfshar: 3.8 trillion photos were taken in all of human history until mid-2011. 1 trillion photos were taken in 2015 alone
    • @ascendantlogic: 2010: Rewrite all the ruby apps with javascript 2012: Rewrite all the javascript apps with Go 2014: Rewrite all the Go apps with Rust
    • @kylebrussell: “Virtual reality was tried in the 90s!” Yeah, with screens that had 7.9% of the Oculus Rift CV1 resolution
    • @kevinmarks: #socosy2016 @BobMankoff: people don't like novelty - they like a little novelty in a cocoon of familiarity, that they could have thought of
    • @toddhoffious: The problem nature has solved is efficient variable length headers. Silicon doesn't like them for networks, or messaging protocols. DNA FTW.
    • @jaykreps: I'm loving the price war between cloud providers, cheap compute enables pretty much everything else in technology. 
    • The Confidence Game: Transition is the confidence game’s great ally, because transition breeds uncertainty. There’s nothing a con artist likes better than exploiting the sense of unease we feel when it appears that the world as we know it is about to change.
    • @somic: will 2016 be the year of customer-defined allocation strategies for aws spot fleet? (for example, through a call to aws lambda)
    • beachstartup: i run an infrastructure startup. the rule of thumb is once you hit $20-99k/month, you can cut your AWS bill in half somewhere else. sites in this phase generally only use about 20% of the features of aws.
    • @fart: the most important part of DevOps to me is “kissing the data elf”
    • @destroytoday: In comparison, @ProductHunt drove 1/4 the traffic of Hacker News, but brought in 700+ new users compared to only 20 from HN.
    • @aphyr~ Man, if people knew even a *tenth* of the f*cked up shit tech company execs have tried to pull... Folks are *awfully* polite on twitter.
    • @eric_analytics: It took Uber five years to get to a billion rides, and its Chinese rival just did it in one
    • lowpro: Being a 19 year old college student with many friends in high school, I can say snapchat is the most popular social network, followed by Instagram then Twitter, and lastly Facebook. If something is happening, people will snap and tweet about it, Instagram and Facebook are reserved for bigger events that are worth mentioning, snapchat and Twitter are for more day to day activities and therefore get used much more often.
    • Thaddeus Metz: The good, the true, and the beautiful give meaning to life when we transcend our animal nature by using our rational nature to realize states of affairs that would be appreciated from a universal perspective.
    • Reed Hastings: We realized we learned best by getting in the market and then learning, even if we’re less than perfect. Brazil is the best example. We started [there] four years ago. At first it was very slow growth, but because we were in the market talking to our members who had issues with the service, we could get those things fixed, and we learned faster.

  • Why has Bitcoin failed? From Mike Hearn: it has failed because the community has failed. What was meant to be a new, decentralised form of money that lacked “systemically important institutions” and “too big to fail” has become something even worse: a system completely controlled by just a handful of people. Worse still, the network is on the brink of technical collapse. The mechanisms that should have prevented this outcome have broken down, and as a result there’s no longer much reason to think Bitcoin can actually be better than the existing financial system.

  • Lessons learned on the path to production. From Docker CEO: 1) IaaS is too low; 2) PaaS is too high: Devs do not adopt locked down platforms; 3) End to end matters: Devs care about deployment, ops cares about app lifecycle and origin; 4) Build management, orchestration, & more in a way that enables portability; 5) Build for resilience, not zero defects; 6) If you do 5 right, agility + control

  • Is this the Tesla of database systems? No Compromises: Distributed Transactions with Consistency, Availability, and Performance: FaRMville transactions are processed by FaRM – the Fast Remote Memory system that we first looked at last year. A 90 machine FaRM cluster achieved 4.5 million TPC-C ‘new order’ transactions per second with a 99th percentile latency of 1.9ms. If you’re prepared to run at ‘only’ 4M tps, you can cut that latency in half. Oh, and it can recover from failure in about 60ms. 

  • Uber tells the story behind the design and implementation of their scalable datastore using MySQL. Uber took the path of many others in writing an entire layer on top of MySQL to create the database that best fits their use case. Uber wanted: to be able to linearly add capacity by adding more servers; write availability; a way of notifying downstream dependencies; secondary indexes; operational trust in the system, as it contains mission-critical trip data. They looked at Cassandra, Riak, and MongoDB, etc. Features alone did not decide their choice. What did?: "the decision ultimately came down to operational trust in the system we’d use." If you are Uber this is a good reason that may not seem as important to those without accountability. Uber's design is inspired by Friendfeed, and the focus on the operational side inspired by Pinterest.

Categories: Architecture

Live Video Streaming At Facebook Scale

Wed, 01/13/2016 - 17:56
With 1.49 billion monthly active users, operating at Facebook scale is far from trivial. Facebook's new live video streaming services present a fascinating use case for designing a streaming service for global distribution at massive scale.
Categories: Architecture

A Beginner's Guide to Scaling to 11 Million+ Users on Amazon's AWS

Mon, 01/11/2016 - 17:56

How do you scale a system from one user to more than 11 million users? Joel Williams, Amazon Web Services Solutions Architect, gives an excellent talk on just that subject: AWS re:Invent 2015 Scaling Up to Your First 10 Million Users.

If you are an advanced AWS user this talk is not for you, but it’s a great way to get started if you are new to AWS, new to the cloud, or if you haven’t kept up with the constant stream of new features Amazon keeps pumping out.

As you might expect, since this is a talk by Amazon, Amazon services are always front and center as the solution to any problem. Their platform play is impressive and instructive. It's obvious from how the pieces all fit together that Amazon has done a great job of mapping out what users need and then making sure they have a product in that space.

Some of the interesting takeaways:

  • Start with SQL and only move to NoSQL when necessary.
  • A consistent theme is to take components and separate them out. This allows those components to scale and fail independently. It applies to breaking up tiers and creating microservices.
  • Only invest in tasks that differentiate you as a business, don't reinvent the wheel.
  • Scalability and redundancy are not two separate concepts, you can often do both at the same time.
  • There's no mention of costs. That would be a good addition to the talk as that is one of the major criticisms of AWS solutions.
The Basics
Categories: Architecture

Uptime Funk - Best Sysadmin Parody Video Ever!

Sun, 01/10/2016 - 18:14

This is so good! Perfect for your Monday morning jam.

 

Uptime Funk is a music video (parody of Uptown Funk) from SUSECon 2015 in Amsterdam. My favorite:  I'm all green (hot patch)
Called a Penguin and Chameleon
I'm all green (hot patch)
Call Torvalds and Kroah-Hartman
It’s too hot (hot patch)
Yo, say my name you know who I am
It’s too hot (hot patch)
I ain't no simple code monkey
Nuthin's down
Categories: Architecture

Stuff The Internet Says On Scalability For January 8th, 2016

Fri, 01/08/2016 - 17:56

Hey, it's HighScalability time:


Finally, a clear diagram of Amazon's industry impact. (MARK A. GARLICK)

 

  • 150: # of globular clusters in the Milky Way; 800 million: Facebook Messenger users; 180,000: high-res images of the past; 1 exaflops: 1 million trillion floating-point operations per second; 10%: of Google's traffic is now IPv6; 100 milliseconds: time it takes to remember; 35: percent of all US Internet traffic used by Netflix; 125 million: hours of content delivered each day by Netflix's CDN;

  • Quotable Quotes:
    • Erik DeBenedictis: We could build an exascale computer today, but we might need a nuclear reactor to power it
    • wstrange: What I really wish the cloud providers would do is reduce network egress costs. They seem insanely expensive when compared to dedicated servers.
    • rachellaw: What's fascinating is the bot-bandwagon is mirroring the early app market. With apps, you downloaded things to do things. With bots, you integrate them into things, so they'll do it for you. 
    • erichocean: The situation we're in today with RAM is pretty much the identical situation with the disks of yore.
    • @bernardgolden: @Netflix will spend 2X what HBO does on programming in 2016? That's an amazing stat. 
    • @saschasegan: Huawei's new LTE modem has 18 LTE bands. Qualcomm's dominance of LTE is really ending this year.
    • Unruly Places: The rise of placelessness, on top of the sense that the whole planet is now minutely known and surveilled, has given this dissatisfaction a radical edge, creating an appetite to find places that are off the map and that are somehow secret, or at least have the power to surprise us.
    • @mjpt777: Queues are everywhere. Recognise them, make them first class, model and monitor them for telemetry.
    • Guido de Croon:  the robot exploits the impending instability of its control system to perceive distances. This could be used to determine when to switch off its propellers during landing, for instance.
    • @gaberivera: In the future, all major policy questions will be settled by Twitter debates between venture capitalists
    • Craig McLuckie: It’s not obvious until you start to actually try to run massive numbers of services that you experience an incredible productivity that containers bring
    • Brian Kirsch: One of the biggest things when you look at the benefits of container-based virtualization is its ability to squeeze more and more things onto a single piece of hardware for cost savings. While that is good for budgets, it is excessively horrible when things go bad.
    • @RichardWarburto: It still surprises me that configuration is most popular user of strong consistency models atm. Is config more important than data
    • @jamesurquhart: Five years ago I predicted CFO would stop complaining about up front cost, and start asking to reduce monthly bill. Seeing that happen now.
    • @martinkl: Communities in a nutshell… • Databases research: “In fsync we trust” • Distributed systems research: “In majority vote we trust”
    • @BoingBoing: Tax havens hold $7.6 trillion; 8% of world's total wealth
    • @DrQz: Amazon's actual profits are still tiny, relying heavily on its AWS cloud business.
    • hadagribble: we need to view fast storage as something other than disk behind a block interface and slow memory, especially with all the different flavours of fast persistent storage that seem to be on the horizon. For the one's that attach to the memory bus, the PMFS-style [1] approach of treating them like a file-system for discoverability and then mmaping to allow them to be accessed as memory is pretty attractive.

  • EC2 with a 5% price reduction on certain things in certain places. Not exactly the race to the bottom one would hope for in a commodity market, which means the cloud is not a commodity. Happy New Year – EC2 Price Reduction (C4, M4, and R3 Instances).

  • Since the locus of the Internet is centering on a command line interface in the form of messaging, chatbot integrations may be giving APIs a second life, assuming they are let inside the walled garden. The next big thing in computing is called 'ChatOps,' and it's already happening inside Slack. The advantage chatops has over the old Web + API mashup dream is that messaging platforms come built-in with a business model/app store, large and growing user base, and network effects. Facebook’s Secret Chat SDK Lets Developers Build Messenger Bots. Slack apps. WeChat API. Telegram API. Alexa API. Google's Voice Actions. How about Siri or iMessage? Nope. njovin likes it: I've worked with the new Chat SDK and our customers' use cases aren't geared toward forcing (or even encouraging) users into using Facebook Messenger. Most of them are just trying to meet demand from their customers. In our particular case, we have customers with a lot of international travelers who have access to data while abroad but not necessarily SMS. IMO it's a lot better than having a dedicated app you have to download to interact with a specific brand.

  • The world watched a lot of porn this year. If you like analytics you'll love Pornhub’s 2015 Year in Review: In 2015 alone, we streamed 75GB of data a second; bandwidth used is 1,892 petabytes; 4,392,486,580 hours of video were watched; 21.2 billion visits.

  • A very interesting way to frame the issue. On the dangers of a blockchain monoculture: The Bitcoin blockchain: the world’s worst database. Would you use a database with these features? Uses approximately the same amount of electricity as could power an average American household for a day per transaction. Supports 3 transactions / second across a global network with millions of CPUs/purpose-built ASICs. Takes over 10 minutes to “commit” a transaction. Doesn’t acknowledge accepted writes: requires you read your writes, but at any given time you may be on a blockchain fork, meaning your write might not actually make it into the “winning” fork of the blockchain (and no, just making it into the mempool doesn’t count). In other words: “blockchain technology” cannot by definition tell you if a given write is ever accepted/committed except by reading it out of the blockchain itself (and even then). Can only be used as a transaction ledger denominated in a single currency, or to store/timestamp a maximum of 80 bytes per transaction. But it’s decentralized!

Categories: Architecture

Let's Donate Our Organs and Unused Cloud Cycles to Science

Wed, 01/06/2016 - 17:56

There’s a long history of donating spare compute cycles for worthy causes. Most of those efforts were started in the Desktop Age. Now, in the Cloud Age, how can we donate spare compute capacity? How about through a private spot market?

There are cycles to spare. Public Cloud Usage trends:

  • Instances are underutilized with average utilization rates between 8-9%

  • 24% of instance reservations are unused

Maybe all that CapEx sunk into Reserved Instances can be put to some use? Maybe over provisioned instances could be added to the resource pool as well? That’s a lot of power Captain. How could it be put to good use?

There is a need to crunch data. For science. Here’s a great example as described in This is how you count all the trees on Earth. The idea is simple: from satellite pictures count the number of trees. It’s an embarrassingly parallel problem, perfect for the cloud. NASA had a problem. Their cloud is embarrassingly tiny. 400 hypervisors shared amongst many projects. Analysing all the data would take 10 months. An unthinkable amount of time in this Real-time Age. So they used the spot market on AWS.
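What "embarrassingly parallel" means in practice can be shown with a toy sketch. Everything here is an assumption made for illustration: the tiles are tiny grids rather than satellite images, and `count_trees` simply counts cells marked 1 instead of doing real image analysis.

```python
from concurrent.futures import ThreadPoolExecutor


def count_trees(tile):
    """Stand-in for the real image analysis: counts cells marked 1 in a tile."""
    return sum(cell == 1 for row in tile for cell in row)


def count_all(tiles, workers=4):
    # No tile depends on any other tile, so the work can be spread over
    # any number of workers and the partial counts simply added up.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(count_trees, tiles))


tiles = [
    [[1, 0], [0, 1]],
    [[1, 1], [1, 0]],
    [[0, 0], [0, 0]],
]
total = count_all(tiles)
```

Threads keep the sketch short. For real CPU-heavy analysis the same map-then-sum shape would be spread across processes or, as in the NASA case, across many rented machines.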

The upshot? The test run cost a measly $80, which means that NASA can process data collected for an entire UTM zone for just $250. The cost for all 11 UTM zones in sub-Saharan Africa and the use of all four satellites comes in at just $11,000.

“We have turned what was a $200,000 job into a $10,000 job and we went from 100 days to 10 days [to complete],” said Hoot. “That is something scientists can build easily into their budget proposals.”
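The quoted numbers hang together if you read the $250 as the cost of one UTM zone for one satellite. That reading is my assumption, not something the quote spells out. A quick back-of-the-envelope sketch in Python, using only the figures quoted above:

```python
# Figures quoted in the article. Treating $250 as "per zone, per satellite"
# is an assumption made here so that the quoted total can be reproduced.
COST_PER_ZONE_USD = 250
UTM_ZONES = 11
SATELLITES = 4


def total_cost(cost_per_zone: int, zones: int, satellites: int) -> int:
    """Total cost if every zone is processed once per satellite."""
    return cost_per_zone * zones * satellites


def improvement(before: float, after: float) -> float:
    """How many times smaller 'after' is than 'before'."""
    return before / after


if __name__ == "__main__":
    print(total_cost(COST_PER_ZONE_USD, UTM_ZONES, SATELLITES))  # 11000
    print(improvement(200_000, 10_000))  # cost: 20.0 times cheaper
    print(improvement(100, 10))          # time: 10.0 times faster
```

So the spot market run is roughly 20 times cheaper and 10 times faster than the job as originally scoped.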

That last quote, “That is something scientists can build easily into their budget proposals,” stuck in my craw.

Imagine how much science could get done if you didn’t have the budget proposal process slowing down the future? Especially when we know there are so many free cycles available that are already attached to well supported data processing pipelines. How could those cycles be freed up to serve a higher purpose?

Netflix shows the way with their internal spot market. Netflix has so many cloud resources at their disposal, a pool of 12,000 unused reserved instances at peak times, that they created their own internal spot market to drive better utilization. The whole beautiful setup is described in Creating Your Own EC2 Spot Market, Creating Your Own EC2 Spot Market -- Part 2, and High Quality Video Encoding at Scale.

The win: By leveraging the internal spot market Netflix measured the equivalent of a 210% increase in encoding capacity.

Netflix has a long and glorious history of sharing and open sourcing their tools. It seems likely that once they perfect their spot market infrastructure, it could be made generally available.

Perhaps the Netflix spot market could be extended so unused resources across the Clouds could advertise themselves for automatic integration into a spot market usable by scientists to crunch data and solve important world problems.
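To make the idea a little more concrete, here is a toy sketch of what "advertise, borrow, reclaim" might look like. This is not how Netflix's spot market works, and none of these names come from their posts. `SpotPool`, `advertise`, `borrow`, and `reclaim` are hypothetical names invented for illustration, and the whole thing is an in-memory stand-in for what would really be a distributed service.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SpotPool:
    """Toy in-memory pool of idle instances lent to science jobs (hypothetical)."""

    idle: List[str] = field(default_factory=list)       # advertised, unused instance ids
    lent: Dict[str, str] = field(default_factory=dict)  # instance id -> borrowing job id

    def advertise(self, instance_id: str) -> None:
        """An owner offers an unused instance to the pool."""
        if instance_id not in self.idle and instance_id not in self.lent:
            self.idle.append(instance_id)

    def borrow(self, job_id: str) -> Optional[str]:
        """A science job takes the oldest idle instance, or None if the pool is empty."""
        if not self.idle:
            return None
        instance_id = self.idle.pop(0)
        self.lent[instance_id] = job_id
        return instance_id

    def reclaim(self, instance_id: str) -> Optional[str]:
        """The owner takes the instance back; returns the interrupted job id, if any."""
        if instance_id in self.idle:
            self.idle.remove(instance_id)
            return None
        return self.lent.pop(instance_id, None)


if __name__ == "__main__":
    pool = SpotPool()
    pool.advertise("i-aaa")
    pool.advertise("i-bbb")
    borrowed = pool.borrow("count-trees-zone-1")
    print(borrowed)                # i-aaa
    print(pool.reclaim(borrowed))  # count-trees-zone-1
```

The important design point is the last method: as with any spot market, the owner can take capacity back at any time, so borrowed work has to be interruptible. Embarrassingly parallel jobs like tree counting fit that constraint nicely.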

Perhaps donated cycles could even be charitable contributions that could help offset the cost of the resource? My wife is a tax accountant and she says this is actually true, under the right circumstances.

This kind of idea has a long history with me. When AWS first started, I, like a lot of people, wondered: how can I make money off this gold rush? That was before we knew Amazon was going to make most of the tools to sell to the miners themselves. The idea of exploiting underutilized resources fascinated me for some reason. That is, after all, what VMs do for physical hardware: exploit the underutilized resources of powerful machines. And it is in some ways the idea behind our modern economy. Yet even today software architectures aren’t such that we reach anything close to full utilization of our hardware resources. What I wanted to do was create a memcached system that allowed developers to sell their unused memory capacity (and later CPU, network, storage) to other developers as cheap dynamic pools of memcached storage. Developers could get their cache dirt cheap, and other developers could make some money back on underused resources. A very similar idea to the spot market notion. But without homomorphic encryption the security issues were daunting, even assuming Amazon would allow it. With the advent of the Container Age, sharing a VM is now way more secure, and Amazon shouldn’t have a problem with the idea if it’s for science. I hope.

Categories: Architecture