High Scalability - Building bigger, faster, more reliable websites

Instagram Improved their App's Performance. Here's How.

Mon, 09/29/2014 - 16:56

Is flat design just another pretty face or is it a huge performance hack cloaked as a UI revolution? It turns out flat design is a stone cold performance win.

This and more is expertly explained by Tyler Kieft, Engineer at Instagram, in a crisp, content-filled talk he gave at the @scale conference: Instagram on Typical Android. This talk was part of a series of talks given by Facebook on how to design for the reality of mobile applications across the globe, where phones are slower, screens are smaller, and networks are slower than they are in the US.

Designing for a typical phone rather than a high-end phone required the Instagram team to rethink their design in a deep way. One of the revelations in Tyler's talk was that moving to a flat design not only made the application more beautiful and more usable, it also substantially increased performance.

This was quite a surprise. I've only ever thought of flat design as just a way to think about how to build pretty UIs. Silly me. Thanks to Tyler for explaining the benefits of flat design so clearly and forcefully, using Instagram as a great example of what is possible.

Flat design is the anti-skeuomorphism, going digital native, eschewing a slavish obsession with the appearance of reality, adopting simple elements, simple typography, flat colors, and simple designs.

Using flat design Instagram was able to shave 120ms off its cold start times. It was also able to reduce the number of assets it took to display the feed screen from 29 down to 8. All while making the application more beautiful and more usable, with more focus given to the content across different phone sizes.

How did flat design make all this possible? Please keep on reading...

Categories: Architecture

Stuff The Internet Says On Scalability For September 26th, 2014

Fri, 09/26/2014 - 16:56

Hey, it's HighScalability time:


With tensegrity landing balls we'll be the coolest aliens to ever land on Mars.
  • 6-8Tbps:  Apple’s live video stream; $65B: crowdfunding's contribution to the global economy
  • Quotable Quotes:
    • @bodil: I asked @richhickey and he said "a transducer is just a pre-fused Kleisli arrows in the list monad." #strangeloop
    • @lusis: If you couldn’t handle runit maybe you shouldn’t be f*cking with systemd. You’ll shoot your g*ddamn foot off.
    • Rob Neely: Programming model stability + Regular advances in realized performance = scientific discovery through computation
    • @BenedictEvans: Maybe 5bn PCs have been sold so far. And 17bn mobile phones.
    • @xaprb: "There's no word for the opposite of synergy" @jasonh at #surgecon

  • The SSD Endurance Experiment. The good news: You don't have to worry about writing a lot of data to SSDs anymore. The bad news: When your SSD does die your data may not be safe. Good discussion on Hacker News.

  • Don't have a lot of money? Don't worry. Being cheap can actually create cool: Teleportation was used in Star Trek because the budget couldn't afford expensive shots of spaceships landing on different planets.

  • Not so crazy after all? Google’s Internet “Loon” Balloons Will Ring the Globe within a Year

  • Before cloud and after cloud as told through a car crash

  • Cluster around, dear readers: videos from MesosCon 2014 are now available.

  • From Backbone To React: Our Experience Scaling a Web Application. This seems a lot like the approach Facebook uses in their Android apps. As things get complex, move the logic into a top-level centralized manager and then distribute changes down to components, which are not incrementally changed but replaced entirely.

  • Deciding between GAE or EC2? This might help: Running a website: Google App Engine vs. Amazon EC2. AWS is hard to set up. Both give you a lot for free. GAE is not customizable. On AWS you can use whatever languages and software you want. On GAE, once written, your software will scale. If you have a sysadmin or your project requires specific software go with AWS. If you are small or have a static site go with GAE.

  • Mean vs Lamp – How Do They Stack Up? MEAN = MongoDB, Express.js, Angular.js, and Node.js (versus LAMP's Linux, Apache, MySQL, and PHP or Python). Why go MEAN? The three most significant benefits are a single language from top to bottom, flexibility in deployment platform, and enhanced speed in data retrieval. However, the switch is not without trade-offs; any existing code will either need to be rewritten in JavaScript or integrated into the new stack in a non-obvious manner.

  • Free the Web: Sometimes, I feel like blaming money. When money comes into play, people start to fear. They fear losing their money, and they fear losing their visitors. And so they focus on making buttons easily clickable (which inevitably narrows down places where they can go), and they focus on making sites that are safe but predictably usable.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

5 Tips for Scaling NoSQL Databases: Don’t Trust Assumptions—Test, Test, Test!

Wed, 09/24/2014 - 17:08

Alex Bordei, product manager for Bigstep’s Full Metal Cloud, in Scaling NoSQL databases: 5 tips for increasing performance, shares a nice set of lessons he's learned about how NoSQL databases scale:

  • Never assume linearity in scaling. Hardware prices grow exponentially as the specs increase, but not all software can take full advantage of all that power. So you may be paying for hardware your database can't use. Find the sweet spot for price and hardware capabilities.
  • Tests speak louder than specs. Don't trust vendor documentation. It's cheap to spin up new instances, so test the specs for yourself (a minimal benchmark sketch follows below).
  • Mind the details: Memory & CPU numbers matter. For in-memory databases the specs on your memory modules matter. Faster memory means faster performance. Same for CPU frequencies. Pay attention to what your money is buying.
  • Do not neglect network latency. Paying for fast memory and fast CPU won't do a lot of good if your network is slow. 
  • Avoid virtualization with NoSQL databases. Virtualization can exact a 20-200% performance penalty. Noisy neighbors also help ruin the neighborhood. Up to 400% performance gains can be seen by switching away from virtualization and adopting bare metal clouds.

Lots of good advice. Each of these points is discussed in more detail in the original article, which is well worth reading.
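The second tip is the easiest to act on, because a useful test doesn't have to be elaborate. Below is a minimal benchmark sketch, not from the article, of the kind of throughput and latency check you might run on a candidate instance before trusting the spec sheet. The run_op callable is a hypothetical stand-in for a single read or write issued through whatever database client you actually use.

```python
import statistics
import time

def benchmark(run_op, iterations=10_000):
    """Time a single-operation callable and report throughput and latency.

    run_op is a hypothetical stand-in for one read or write issued
    through your database client; swap in a real call before relying
    on the numbers.
    """
    latencies = []
    start = time.perf_counter()
    for _ in range(iterations):
        t0 = time.perf_counter()
        run_op()
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start

    latencies.sort()
    return {
        "ops_per_sec": iterations / elapsed,
        "mean_ms": statistics.mean(latencies) * 1000,
        "p50_ms": latencies[len(latencies) // 2] * 1000,
        "p99_ms": latencies[int(len(latencies) * 0.99)] * 1000,
    }

if __name__ == "__main__":
    # Replace the no-op lambda with a real database read or write.
    print(benchmark(lambda: None))
```

Run the same script on two instance sizes and you will quickly see whether doubling the hardware spend actually doubles throughput, which is the point of the first tip about never assuming linearity.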

 

Categories: Architecture

How Facebook Makes Mobile Work at Scale for All Phones, on All Screens, on All Networks

Mon, 09/22/2014 - 16:58

Update: Instagram Improved Their App's Performance. Here's How.

When you find your mobile application that ran fine in the US is slow in other countries, how do you fix it? That’s a problem Facebook talks about in a couple of enlightening videos from the @scale conference. Since mobile is eating the world, this is the sort of thing you need to consider with your own apps.

In the US we may complain about our mobile networks, but that’s more #firstworldproblems talk than reality. Mobile networks in other countries can be much slower and cost a lot more. This is the conclusion from Chris Marra, Project Manager at Facebook, in a really interesting talk titled Developing Android Apps for Emerging Markets.

Facebook found in the US there’s 70.6% 3G penetration with 280ms average latency. In India there’s 6.9% 3G penetration with 500ms latency. In Brazil there’s 38.6% 3G penetration with more than 850ms average latency.

Chris also talked about Facebook’s comprehensive research on who uses Facebook and what kind of phones they use. In summary they found not everyone is on a fast phone, not everyone has a large screen, and not everyone is on a fast network.

It turns out the typical phone used by Facebook users is from circa 2011, dual core, with less than 1GB of RAM. By designing for a high-end phone Facebook found that their low-end users, who are in fact the typical users, had poor experiences.

For the slow phone problem Facebook created a separate application that used lighter weight animations and other strategies to work on lower end phones. For the small screen problem Facebook designers made sure applications were functional at different screen sizes.

Facebook has moved to a product organization. A single vertical group is responsible for producing a particular product rather than having, for example, an Android team try to create all Android products. There’s also a horizontally focussed Android team trying to figure out best practices for Android, delving deep into the details of what makes a platform tick.

Each team is responsible for the end-to-end performance and reliability for their product. There are also core teams looking at and analyzing general performance problems and helping where needed to improve performance.

Both core teams and product teams are needed. The core team is really good at instrumentation, identifying problems, and working with product teams to fix them. For mobile it’s important that each team owns their full product end-to-end. That means owning core engagement and performance metrics, including daily usage, cold start times, and reliability, while also knowing how to fix problems.

To solve the slow network problem there’s a whole other talk. This time the talk is given by Andrew Rogers, Engineering Manager at Facebook, and it’s titled Tuning Facebook for Constrained Networks. Andrew talks about three methods to help deal with network problems: Image Download Sizes, Network Quality Detection, Prefetching Content.
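Andrew's talk covers those techniques at a high level; the snippet below is only a rough sketch of the network quality detection idea, not Facebook's implementation. It times a small HTTP request and buckets the connection quality so the app can choose smaller images or do less prefetching on poor networks. The probe URL and the threshold values are illustrative assumptions.

```python
import time
import urllib.request

# Illustrative latency thresholds in milliseconds; tune for your own app.
QUALITY_BUCKETS = [(150, "EXCELLENT"), (400, "GOOD"), (850, "MODERATE")]

def classify_network(probe_url="https://example.com/tiny.gif", timeout=3.0):
    """Time one small request and bucket the connection quality.

    A real client would sample repeatedly and decay old measurements;
    this one-shot probe only illustrates the idea.
    """
    try:
        start = time.perf_counter()
        with urllib.request.urlopen(probe_url, timeout=timeout) as resp:
            resp.read()
        elapsed_ms = (time.perf_counter() - start) * 1000
    except OSError:
        return "POOR"

    for threshold_ms, label in QUALITY_BUCKETS:
        if elapsed_ms <= threshold_ms:
            return label
    return "POOR"

def image_width_for(quality):
    """Pick an image download size appropriate to the measured quality."""
    return {"EXCELLENT": 1080, "GOOD": 720, "MODERATE": 480}.get(quality, 320)

if __name__ == "__main__":
    quality = classify_network()
    print(quality, image_width_for(quality))
```

A production client would combine radio type, round-trip time, and observed throughput over many samples rather than trusting a single probe.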

Overall, please note the immense effort that is required to operate at Facebook scale. Not only do you have different platforms like Android and iOS, you have different device segments within each platform that you must code and design for. This is crazy hard to do.

Reducing Image Sizes - WebP saved over 30% compared to JPEG and 80% compared to PNG
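As a rough illustration of where that saving comes from, the sketch below re-encodes an existing JPEG or PNG as WebP using the Pillow imaging library (which supports WebP output) and prints the size difference. The file name and quality setting are placeholders, and actual savings depend heavily on the source images.

```python
import os
from PIL import Image  # Pillow; WebP output requires libwebp support

def to_webp(src_path, quality=80):
    """Re-encode an image as WebP and report the change in file size."""
    dst_path = os.path.splitext(src_path)[0] + ".webp"
    with Image.open(src_path) as img:
        img.save(dst_path, "WEBP", quality=quality)
    before = os.path.getsize(src_path)
    after = os.path.getsize(dst_path)
    print(f"{src_path}: {before} -> {after} bytes "
          f"({100 * (before - after) / before:.1f}% smaller)")
    return dst_path

if __name__ == "__main__":
    to_webp("feed_photo.jpg")  # placeholder file name
```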
Categories: Architecture

Stuff The Internet Says On Scalability For September 19th, 2014

Fri, 09/19/2014 - 16:56

Hey, it's HighScalability time:


Galactic bolt-hole or supermassive black hole, weighing as much as 21 million suns?
  • Quotable Quotes:
    • @debuggist: Chief takeaway from #velocityconf for me: failure happens so monitor for the ones that are important albeit in systems or culture & fix them
    • @Carnage4Life: The real tech bubble is valuations of Google, Facebook & Twitter are inflated by app install ads from unprofitable startups 
    • Jay Parikh: Android has 2x as many users as iOS. However, iOS average revenue per user is 4x higher than Android. 
    • Joe Armstrong: We’ve made a mess. Need to reverse entropy. Quantum mechanics sets limits to ultimate speed of computation. We need Math. Abolish names and places. Build the condenser. Make low-power computers - no net environmental damage.
    • @reneritchie: iOS 8 is a beefy update. The internet is feeling the strain of millions of downloads. If it’s slow or stuttering, just give it some time.
    • @vishyp: Quip is big on disconnected clients and clients as write-thru caches. Like it! Uses C++ for x-platform. @btaylor #atscale2014
    • @csabacsoma: Instagram: we use Postgres for almost everything now, some Redis and Memcached #AtScale2014
    • @rfberry: "To maximize throughput, we needed an integrated approach to backpressure." #AtScale2014
    • @sharonw: I asked this Facebook / Instagram panel about image upload. Key changes: retry aggressively, and resize and encode on device. #AtScale2014
    • Igor Zaika: Most of the developers who work on Microsoft Office are younger than the codebase. 
    • Chris Marra: There are about 10k different device models accessing Facebook. Designing for high-end smartphones does not cut it.
    • @Carnage4Life: Apple spends $100m on a U2 album you don't want. Microsoft spends that on ads you don't like. Amazon spends it on free shipping #Marketing
    • @beerops: Laziness as a multiplier: train other people to do what you can do to remove yourself as a bottleneck #velocityconf
    • @xaprb: "Failure is a feature of complex systems." - #velocityconf
    • @dalmaer: "Our mobile app is a write through cache (SQLite) to the source of truth (MySQL on AWS)" -- @btaylor #AtScale2014
    • @herminghaus: @BenedictEvans Designed an 11.4TB patent retrieval system in 1993 with slow WORM robots. Cost $140m. Now <$1000.- at BestBuy.
    • @BenedictEvans: You can't ask people to decide on a trade-off when they have experience of one side but not the other.

  • Caching at Scale. There's a need to better manage caching, especially under failure conditions. Solutions are generally in the form of a proxy layer above memcached. Along these lines Box created Tron, Twitter created Twemproxy, and Facebook created a value meal in McRouter (a toy sketch of the consistent-hashing idea behind such proxies appears after this list). Database people have always countered: why have a separate cache, just build the cache into the database? This hasn't worked for various reasons, mostly because a database always cares more about being a good database than about being a good cache. Vitess wants to fix that. Vitess is an open-source system written in Go, used at YouTube, that challenges the paradigm of treating caching as a separate layer by directly addressing the issues of database scalability and by modifying the handling of SQL queries.

  • Talk about your Chaos Godzilla. Facebook Turned Off Entire Data Center to Test Resiliency. Before flipping that switch there must have been a little pause, perhaps thinking this wouldn't be prudent, but damn the torpedoes and full speed ahead. Apparently some issues were found, but it went fairly smoothly. Hazaa for the chutzpah.

  • Best LAN party ever? Researchers twist four radio beams together to achieve high data transmission speeds. The researchers reached data transmission rates of 32 gigabits per second across 2.5 meters of free space in a basement lab.

  • This is an understatement. iOS 8, thoroughly reviewed. An amazing job. A big takeaway for me is Apple is systematically removing reasons not to buy an iPhone. Bigger phone. Check. Configurable keyboard. Check. Extensions that display in the today view and allow app cooperation. Check. Another takeaway is Apple is abandoning simplicity for configurability, which means embracing complexity, a potential experience killer.
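To make the "proxy layer above memcached" idea in the caching item above concrete, here is a toy consistent-hashing router of the kind such proxies are built on. It is not Tron, twemproxy, or McRouter, just a minimal sketch: keys are mapped onto a hash ring of cache hosts so that adding or removing a host remaps only a small fraction of the keys.

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring for routing cache keys to hosts."""

    def __init__(self, hosts, replicas=100):
        self.replicas = replicas
        self.points = []  # sorted hash points on the ring
        self.ring = []    # (point, host) pairs, parallel to points
        for host in hosts:
            self.add(host)

    def _hash(self, value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def add(self, host):
        # Each host gets several virtual points to even out the distribution.
        for i in range(self.replicas):
            point = self._hash(f"{host}#{i}")
            idx = bisect.bisect(self.points, point)
            self.points.insert(idx, point)
            self.ring.insert(idx, (point, host))

    def remove(self, host):
        kept = [(p, h) for (p, h) in self.ring if h != host]
        self.ring = kept
        self.points = [p for (p, _) in kept]

    def host_for(self, key):
        if not self.ring:
            raise ValueError("no cache hosts configured")
        idx = bisect.bisect(self.points, self._hash(key)) % len(self.ring)
        return self.ring[idx][1]

if __name__ == "__main__":
    ring = HashRing(["cache1:11211", "cache2:11211", "cache3:11211"])
    print(ring.host_for("user:42:profile"))
```

A real proxy layers connection pooling, failure handling, and traffic shadowing on top of this routing, which is the "better manage caching, especially under failure conditions" part.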

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

The FireBox Warehouse Scale Computer in 2020 Will Have 1K Sockets, 100K Cores, 100PB NV RAM, and a 4Pb/s Network

Wed, 09/17/2014 - 16:56

That's the eye popping prediction from Krste Asanović, University of California, Berkeley, in a presentation he gave at FAST '14 titled: FireBox: A Hardware Building Block for 2020 Warehouse-Scale Computers (pdf).

The FireBox system looks like this:

Trends in Warehouse Scale Computers (WSCs):
Categories: Architecture

Sponsored Post: Apple, Flipboard, All Your Base, Scalyr, FoundationDB, AiScaler, Aerospike, AppDynamics, ManageEngine, Site24x7

Tue, 09/16/2014 - 16:56

Who's Hiring?
  • Apple has multiple openings. Changing the world is all in a day's work at Apple. Imagine what you could do here. 
    • Siri Operations Developer. Apple is looking for talented developers to help build the next generation internal cloud platform for Siri. This person should be excited about solving difficult distributed systems problems as well as constantly improving user-experience. This person will be working with a highly technical and motivated team solving the hard problems. Please apply here.
    • Site Reliability Engineer. The Apple Pay Site Reliability Team is hiring for multiple roles focused on the front line customer experience and the back end integration of Apple systems with our Network and Banking partners. Please apply here.
    • Senior Software Engineer, iTunes Infrastructure. Hands-on senior software engineering for the iTunes digital media supply chain engineering team. We are looking for a self starting, energetic individual who is not afraid to question assumptions and with excellent written and oral communication skills. Please apply here
    • iTunes - Content Management Tools Engineer. The candidate should have several years experience developing large-scale web-based applications using object-oriented languages. Excellent understanding of relational databases and data-modeling techniques is also a must. Please apply here

  • Flipboard's Site Reliability Engineering Team is hiring! This team offers great challenges solving unique problems unlike any you have seen!  They work exclusively in the cloud, ensuring a highly available and performant product for millions of users daily.  If you have a passion for large-scale systems and next generation provisioning and orchestration tools, apply here.

  • UI Engineer - AppDynamics. AppDynamics, founded in 2008 and led by proven innovators, is looking for a passionate UI Engineer to design, architect, and develop their user interface using the latest web and mobile technologies. Make the impossible possible and the hard easy. Apply here.

  • Software Engineer - Infrastructure & Big Data. AppDynamics, a leader in next generation solutions for managing modern, distributed, and extremely complex applications residing in both the cloud and the data center, is looking for Software Engineers (all levels) to design and develop scalable software written in Java and MySQL for the backend components that manage application architectures. Apply here.
Fun and Informative Events
  • All Your Base is the only curated database conference of its kind in the UK. Listen to talks from database creators, industry leaders and developers working at the coal face on where to store and how to handle your data. Book tickets.
Cool Products and Services
  • FoundationDB launches SQL Layer. SQL Layer is an ANSI SQL engine that stores its data in the FoundationDB Key-Value Store, inheriting its exceptional properties like automatic fault tolerance and scalability. It is best suited for operational (OLTP) applications with high concurrency. Users of the Key-Value Store will have free access to SQL Layer. SQL Layer is also open source; you can get started with it on GitHub.

  • Better, Faster, Cheaper: Pick Three. Scalyr is your universal tool for visibility into your production systems. Log aggregation, server metrics, monitoring, alerting, dashboards, and more. Not just “hosted grep” or “hosted graphs”; our columnar data store enables enterprise-grade functionality with sane pricing and insane performance. Trusted by in-the-know companies like Codecademy – get on board!

  • Whitepaper Clarifies ACID Support in Aerospike. In our latest whitepaper, author and Aerospike VP of Engineering & Operations, Srini Srinivasan, defines ACID support in Aerospike, and explains how Aerospike maintains high consistency by using techniques to reduce the possibility of partitions.  Read the whitepaper: http://www.aerospike.com/docs/architecture/assets/AerospikeACIDSupport.pdf.

  • aiScaler, aiProtect, aiMobile Application Delivery Controller with integrated Dynamic Site Acceleration, Denial of Service Protection and Mobile Content Management. Cloud deployable. Free instant trial, no sign-up required.  http://aiscaler.com/

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network.

If any of these items interest you there's a full description of each sponsor below. Please click to read more...

Categories: Architecture

Getting Things Right: A Look at Centralized vs Decentralized Systems Through the Eyes of Instant Replay

Mon, 09/15/2014 - 17:06

Three baseball umpires were sitting around a bar, talking about how they make calls on each pitch: First umpire: Some are balls and some are strikes, and I call them as they are. Second umpire: Some are balls and some are strikes, and I call them as I see 'em. Third umpire: Some are balls and some are strikes, but they ain’t nothin' until I call 'em.


AT&T's Global Network Operations Center

 


MLB's Instant Replay Bunker
NHL's Situation Room

It’s fun to look at how concepts we think of as belonging primarily to the domain of computer science play out in other fields. One intriguing example is how Instant Replay reflects and even helps shape the culture of a sport by how replay is implemented: decentralized or centralized.

Lucrative TV deals have pumped huge sums of money into professional sports. With so much money in play, sports have shifted from being pure entertainment to wanting to get things right. The price of making a bad call is just too high to let the human element decide the fate of titans.

Getting things right is also a much talked about subject in computer science. In CS the language of getting things right uses terms like transaction, rollback, quorum, optimistic replication, linearizability, synchronization, lock, eventually consistent, compensating transaction, and so on.

In sports to get things right referees use terms like flag, penalty, by rule, ruling stands, reset the clock, down and distance, line to gain, the whistle blew, ruling confirmed, and ruling overturned.

Though the vocabulary is different, the intent is much the same. Correctness.

Intent is not all tech and sports have in common. As technology evolves we are seeing sports change to take advantage of the new capabilities technology offers. And those changes should be familiar to anyone in software. Sports have gone from a completely decentralized system of officiating to where we now see the NBA, NFL, MLB, and NHL, all converging on some form of a centralized system.

The NHL was the innovator, starting their centralized instant replay system in 2011. It works something like this: officials sit in a war room located in Toronto that looks a lot like every network operations center ever built. Video feeds from all games flow into the room. When there is a controversy or an obvious review-worthy play, Toronto is contacted for a quick review and judgement on the correct call. Every sport will implement its own centralized replay system in its own way, but that's the gist of it.

We’ve seen the exact same transformation as federated services like email have been replaced with centralized services like Twitter and Facebook. It turns out sports and computer science have some deeper similarities. What might those be?

Categories: Architecture

Stuff The Internet Says On Scalability For September 12th, 2014

Fri, 09/12/2014 - 16:56

Hey, it's HighScalability time:


Each dot in this image is an entire galaxy containing billions of stars. What's in there?
  • Quotable Quotes:
    • mseepgood: Or "another language that's becoming popular, Node.js"
    • Joe Moreno: What good are billions of cycles of CPU power that make me wait. I shouldn't have to wait longer and longer due to launching, buffering, syncing, I/O and latency.
    • @stevecheney: Apple Pay is the magic that integrated hardware / software produces. No one else in the world can do this.
    • @etherealmind: Next gen Intel Xeon E5 V3 CPU includes packet processor for 40GBE, 30x increase in OpenSSL crypto, 25% increase in DPDK perf. #IDF14
    • @pbailis: There's actually an interesting question in understanding when to break "sharing" -- at core, NUMA domain, server, or cluster level?
    • @fmueller_bln: Just wait some minutes for vagrant to provision a vm with puppet and you’ll know why docker may be better option for dev machines...

  • Encryption will make fighting the spam war much costlier, reveals Mike Hearn in an awesome post: A brief history of the spam war, where he gives insightful color commentary on the punch-counterpunch between World Heavyweight Champion Google and the challenger, Clever Spammer. Mike worked in the Gmail trenches for over four years and recommends: make sending email cost money; use bitcoin to create deposits.

  • jeswin: No other browser can practically implement or support Dart. If they do, their implementation will be slower than Google's and will get classified as inferior. < Ignoring the merits of Dart, this is an interesting ecosystem effect. By rating sites for reasons other than content quality, Google can in effect select for characteristics over which they have a comparative advantage. It's not an arm's length transaction.

  • Dateline Seattle. Social media users execute a coordinated denial of service attack on cell networks, preventing those in need from accessing emergency services. Who are these terrorists? Football fans. City of Seattle asks people to stop streaming videos, posting photos because of football. Tweets, Instagram, YouTube, and Snapchat are overloading the cell networks so calls can't get through. Should the cell network expand capacity? Should there be an app tax to constrain demand? Should users pay per packet? As a 49ers fan I have another suggestion...move games to a different venue, perhaps the moon. That will help.

  • Are you a militant cable cutter who thinks the future of  TV is the Internet? Not so fast says Dan Rayburn in Internet Traffic Records Could Be Broken This Week Thanks To Apple, NFL, Sony, Xbox, EA and Others: Delivering video over the Internet at the same scale and quality that you do over a cable network isn’t possible. The Internet is not a cable network and if you think otherwise, you will be proven wrong this week. We’re going to see long download times, more buffering of streams, more QoS issues and ISPs that will take steps to deal with the traffic. 

  • Ted Nelson takes on the impossible in How Bitcoin Actually Works (Computers for Cynics #7). And he does an excellent job, sharing his usual insight with a twist. The title is misleading however. There's hardly any cynicism. How disappointing! Ted is clearly impressed with the design and implementation of bitcoin. For good reason. No matter what you think of bitcoin and its potential role in society, it is a very well thought out and impressive piece of technology. On par with Newton, Mr. Nelson suggests. If you watch this you'll probably realize that you don't actually understand bitcoin, even if you think you do, and that's a good thing.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

10 Common Server Setups For Your Web Application

Wed, 09/10/2014 - 16:56

If you need a good overview of different ways to set up your web service then Mitchell Anicas has written a good article for you: 5 Common Server Setups For Your Web Application.

We've even included a few additional possibilities at no extra cost.

  1. Everything on One Server. Simple. Potential for poor performance because of resource contention. Not horizontally scalable. 
  2. Separate Database Server. There's an application server and a database server. Application and database don't share resources. Can independently vertically scale each component. Increases latency because the database is a network hop away.
  3. Load Balancer (Reverse Proxy). Distribute workload across multiple servers. Native horizontal scaling. Protection against DDoS attacks using rules. Adds complexity. Can be a performance bottleneck. Complicates issues like SSL termination and sticky sessions.
  4. HTTP Accelerator (Caching Reverse Proxy). Caches web responses in memory so they can be served faster. Reduces CPU load on web server. Compression reduces bandwidth requirements. Requires tuning. A low cache-hit rate could reduce performance. 
  5. Master-Slave Database Replication. Can improve read and write performance. Adds a lot of complexity and failure modes. (A minimal read/write-splitting sketch follows this list.)
  6. Load Balancer + Cache + Replication. Combines load balancing of the caching servers and the application servers, along with database replication. Nice explanation in the article.
  7. Database-as-a-Service (DBaaS). Let someone else run the database for you.  RDS is one example from Amazon and there are hosted versions of many popular databases.
  8. Backend as a Service (BaaS). If you are writing a mobile application and you don't want to deal with the backend component then let someone else do it for you. Just concentrate on the mobile platform. That's hard enough. Parse and Firebase are popular examples, but there are many more.
  9. Platform as a Service (PaaS). Let someone else run most of your backend, but you get more flexibility than you have with BaaS to build your own application. Google App Engine, Heroku, and Salesforce are popular examples, but there are many more.
  10. Let Someone Else Do It. Do you really need servers at all? If you have a store then a service like Etsy saves a lot of work for very little cost. Does someone already do what you need done? Can you leverage it?
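To show what setup #5 looks like from the application side, here is a bare-bones sketch of read/write splitting: writes go to the master, reads are spread across replicas round-robin. The connection objects are placeholders for DB-API-style connections, and the sketch ignores replication lag, failover, and transactions, which is exactly the complexity the list item warns about.

```python
import itertools

class ReplicatedDB:
    """Route writes to the master and reads across replicas (round-robin).

    master and replicas are assumed to be DB-API-style connection objects
    (anything exposing cursor(), execute(), and commit()); placeholders here.
    """

    def __init__(self, master, replicas):
        self.master = master
        self.replicas = list(replicas) or [master]
        self._next_replica = itertools.cycle(self.replicas)

    def execute_write(self, sql, params=()):
        cur = self.master.cursor()
        cur.execute(sql, params)
        self.master.commit()
        return cur

    def execute_read(self, sql, params=()):
        # Naive round-robin; a real router would also check replica lag.
        conn = next(self._next_replica)
        cur = conn.cursor()
        cur.execute(sql, params)
        return cur.fetchall()
```

Note that a read issued immediately after a write may land on a replica that hasn't applied it yet; stale reads like that are one of the failure modes the summary mentions.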
Categories: Architecture

How Twitter Uses Redis to Scale - 105TB RAM, 39MM QPS, 10,000+ Instances

Mon, 09/08/2014 - 17:05

Yao Yue has worked on Twitter’s Cache team since 2010. She recently gave a really great talk: Scaling Redis at Twitter. It’s about Redis of course, but it's not just about Redis.

Yao has worked at Twitter for a few years. She's seen some things. She’s watched the growth of the cache service at Twitter explode from it being used by just one project to nearly a hundred projects using it. That's many thousands of machines, many clusters, and many terabytes of RAM.

It's clear from her talk that she's coming from a place of real personal experience, and that shines through in the practical way she explores issues. It's a talk well worth watching.

As you might expect, Twitter has a lot of cache.

Timeline Service for one datacenter using Hybrid List:
  • ~40TB allocated heap
  • ~30MM qps
  • > 6,000 instances
Use of BTree in one datacenter:
  • ~65TB allocated heap
  • ~9MM qps
  • >4,000 instances

You'll learn more about BTree and Hybrid List later in the post.

A couple of points stood out:

  • Redis is a brilliant idea because it takes underutilized resources on servers and turns them into a valuable service.
  • Twitter specialized Redis with two new data types that fit their use cases perfectly. So they got the performance they needed, but it locked them into an older code base and made it hard to merge in new features. I have to wonder, why use Redis for this sort of thing? Just create a timeline service using your own data structures. Does Redis really add anything to the party? (A plain-Redis timeline sketch follows this list.)
  • Summarize large chunks of log data on the node, using your local CPU power, before saturating the network.
  • If you want something that’s high performance separate the fast path, which is the data path, away from the slow path, which is the command and control path. 
  • Twitter is moving towards a container environment with Mesos as the job scheduler. This is still a new approach so it's interesting to hear how it works. One issue is the Mesos wastage problem, which stems from the requirement to specify hard resource usage limits in a complicated runtime world.
  • A central cluster manager is really important to keep a cluster in a state that’s easy to understand.
  • The JVM is slow and C is fast. Their cache proxy layer is moving back to C/C++.
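For contrast with Twitter's custom Hybrid List type, here is what a naive capped timeline looks like on stock Redis using the redis-py client: push the new tweet id, then trim the list to the most recent N entries. This is only a sketch of the general pattern, not Twitter's implementation, and the key names and cap are made up.

```python
import redis  # redis-py client

TIMELINE_MAX = 800  # keep only the most recent N entries (made-up cap)

r = redis.Redis(host="localhost", port=6379)

def push_to_timeline(user_id, tweet_id):
    """Prepend a tweet id to a user's timeline and cap the list length."""
    key = f"timeline:{user_id}"
    pipe = r.pipeline()
    pipe.lpush(key, tweet_id)
    pipe.ltrim(key, 0, TIMELINE_MAX - 1)
    pipe.execute()

def read_timeline(user_id, count=20):
    """Return the newest `count` tweet ids, most recent first."""
    return r.lrange(f"timeline:{user_id}", 0, count - 1)

if __name__ == "__main__":
    push_to_timeline(42, 1234567890)
    print(read_timeline(42))
```

The bullet above asks whether stock Redis plus a pattern like this would have been enough; Twitter's answer was to specialize the data types for their use cases and accept the lock-in that came with it.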
With that in mind, let's learn more about how Redis is used at Twitter:

Why Redis?
Categories: Architecture

Stuff The Internet Says On Scalability For September 5th, 2014

Fri, 09/05/2014 - 16:56

Hey, it's HighScalability time:


Telephone Tower, late 1880s, 5000 telephone lines. Switching FTW.
  • 1.3 trillion: rows in a SQL Server table; 100,000: galaxies in the Laniakea supercluster.
  • Quotable Quotes:
    • @pbailis: OLAP: data at rest, queries in motion. Stream processing: data in motion, queries at rest. PSoup: data in motion, queries in motion.
    • @ronpepsi: Scaling rule: addressing one bottleneck always starts the clock ticking on another one. (The same goes for weak links in chains.)
    • @utahkay: Our mental models are deterministic, and break down when you reach high utilization in a stochastic system. 

  • Instagram introduced Hyperlapse, their answer to a world that doesn't move fast enough already. And here's the story of how they did it: The Technology behind Hyperlapse from Instagram. It combines time travel and psychedelics, I think you'll enjoy it.

  • Etsy CEO to Businesses: If Net Neutrality Perishes, We Will Too. The idea of being a common carrier is old, deep, and powerful. It creates markets that grow rather than monopolies that choke economies to death. Ferries were required to be common carriers, that is, they had to ferry all people and goods at the same price. Otherwise communities would not survive. AT&T became a monopoly on the promise of universal service and becoming a common carrier for all. The Internet is a more important version of the same idea.

  • To make lots and lots of money you need to hitch your star to a fast growing something. Google placed ads on an exponentially expanding inventory of 3rd party web content. Winner. Now Google is exploiting another phenomenon experiencing an exponential growth curve: data. This time they aren't placing ads, they are calculating functions with BigQuery. Put On Your Streaming Shoes is a story showing just why and how this jump to another fast growing something will likely succeed.

  • Just an incredible look into the structure behind PhotoGate. Notes on the Celebrity Data Theft. These aren't just script kiddies. These are sophisticated and organized groups. Are hacker networks the new roving band of Vikings looking to rape and pillage? Though it would help if the villages were better protected.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

Strategy: Change the Problem

Wed, 09/03/2014 - 17:04

James T. Kirk's infamous gambit in Starfleet's impossible-to-win Kobayashi Maru test was to redefine the problem into a challenge he could beat.

Interestingly, an article titled Shifts In Algorithm Design says that much the same gambit has become the modern method of solving algorithmic problems.

In the past: 

I, Dick, recall the “good old days of theory.” When I first started working in theory—a sort of double meaning—I could only use deterministic methods. I needed to get the exact answer, no approximations. I had to solve the problem that I was given—no changing the problem.

 

In the good old days of theory, we got a problem, we worked on it, and sometimes we solved it. Nothing shifty, no changing the problem or modifying the goal. 

Today:
Categories: Architecture

Sponsored Post: Apple, Scalyr, Tumblr, Gawker, FoundationDB, CopperEgg, Logentries, BlueStripe, AiScaler, Aerospike, AppDynamics, ManageEngine, Site24x7

Tue, 09/02/2014 - 16:56

Who's Hiring?
  • Apple has multiple openings. Changing the world is all in a day's work at Apple. Imagine what you could do here. 
    • Site Reliability Engineer. The iOS Systems team is building out a Site Reliability organization. In this role you will be expected to work hand-in-hand with the teams across all phases of the project lifecycle to support systems and to take ownership as they move from QA through integrated testing, certification and production.  Please apply here.
    • Server Software Engineer - Maps Community. As an engineer working on Maps Community services, your primary responsibility will be backend server software development for the services that power our data crowdsourcing efforts. You’ll be part of a small team working in Java and Scala to add new features and improve our core infrastructure, leveraging best-of-breed frameworks for scalable distributed computing. Please apply here

  • Make Tumblr fast, reliable and available for hundreds of millions of visitors and tens of millions of users. As a Site Reliability Engineer you are a software developer with a love of highly performant, fault-tolerant, massively distributed systems. Apply here.

  • Systems & Networking Lead at Gawker. We are looking for someone to take the initiative on the lowest layers of the Kinja platform. All the way down to power and up through hardware, networking, load-balancing, provisioning and base-configuration. The goal for this quarter is a roughly 30% capacity expansion, and the goal for next quarter will be a rolling CentOS7 upgrade as well as planning/quoting/pitching our 2015 footprint and budget. For the full job spec and to apply, click here: http://grnh.se/t8rfbw

  • FoundationDB is seeking outstanding developers to join our growing team and help us build the next generation of transactional database technology. You will work with a team of exceptional engineers with backgrounds from top CS programs and successful startups. We don’t just write software. We build our own simulations, test tools, and even languages to write better software. We are well-funded, offer competitive salaries and option grants. Interested? You can learn more here.

  • UI Engineer - AppDynamics. AppDynamics, founded in 2008 and led by proven innovators, is looking for a passionate UI Engineer to design, architect, and develop their user interface using the latest web and mobile technologies. Make the impossible possible and the hard easy. Apply here.

  • Software Engineer - Infrastructure & Big Data. AppDynamics, a leader in next generation solutions for managing modern, distributed, and extremely complex applications residing in both the cloud and the data center, is looking for Software Engineers (all levels) to design and develop scalable software written in Java and MySQL for the backend components that manage application architectures. Apply here.
Fun and Informative Events
  • Your event here.
Cool Products and Services
  • Better, Faster, Cheaper: Pick Three. Scalyr is your universal tool for visibility into your production systems. Log aggregation, server metrics, monitoring, alerting, dashboards, and more. Not just “hosted grep” or “hosted graphs”; our columnar data store enables enterprise-grade functionality with sane pricing and insane performance. Trusted by in-the-know companies like Codecademy – get on board!

  • CopperEgg. Simple, Affordable Cloud Monitoring. CopperEgg gives you instant visibility into all of your cloud-hosted servers and applications. Cloud monitoring has never been so easy: lightweight, elastic monitoring; root cause analysis; data visualization; smart alerts. Get Started Now.

  • Whitepaper Clarifies ACID Support in Aerospike. In our latest whitepaper, author and Aerospike VP of Engineering & Operations, Srini Srinivasan, defines ACID support in Aerospike, and explains how Aerospike maintains high consistency by using techniques to reduce the possibility of partitions.  Read the whitepaper: http://www.aerospike.com/docs/architecture/assets/AerospikeACIDSupport.pdf.

  • aiScaler, aiProtect, aiMobile Application Delivery Controller with integrated Dynamic Site Acceleration, Denial of Service Protection and Mobile Content Management. Cloud deployable. Free instant trial, no sign-up required.  http://aiscaler.com/

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network.

If any of these items interest you there's a full description of each sponsor below. Please click to read more...

Categories: Architecture

Let's Build Maker Cities for Maker People Around New Resources Like Bandwidth, Compute, and Atomically-Precise Manufacturing

Mon, 09/01/2014 - 17:05

TL;DR: There’s a lot of unused space in North America. Yet cities like San Francisco are becoming ever more expensive because of a bubble created by high tech jobs that seemingly can be done anywhere. Historically cities are built around resources that provide some service to humans. The age of infrastructure rising around physical resources is declining while the age of digital resource exploitation is rising. Cities are still valuable because they are amazing idea and problem solving machines. How about we create thousands of new Maker Cities in the vast emptiness that is North America and build them around digital resources like bandwidth, compute power, Atomically-Precise Manufacturing (APM), and all things future and bright?

Observation Number One: There’s lots of empty space out there.
Categories: Architecture