Skip to content

Software Development Blogs: Programming, Software Testing, Agile Project Management

Methods & Tools

Subscribe to Methods & Tools
if you are not afraid to read more than one page to be a smarter software developer, software tester or project manager!

High Scalability - Building bigger, faster, more reliable websites
Syndicate content
Updated: 4 hours 27 min ago

Stuff The Internet Says On Scalability For March 24th, 2017

Fri, 03/24/2017 - 16:56

Hey, it's HighScalability time:

 This is real and oh so eerie. Custom microscope takes a 33 hour time lapse of a tadpole egg dividing.
If you like this sort of Stuff then please support me on Patreon.
  • 40Gbit/s: indoor optical wireless networks; 15%: energy produced by wind in Europe; 5: new tasty particles; 2000: Qubits are easy; 30 minutes: flight time for electric helicopter; 42.9%: of heathen StackOverflowers prefer tabs;

  • Quotable Quotes:
    • @RichRogersIoT: "Did you know? The collective noun for a group of programmers is a merge-conflict." - @omervk
    • @tjholowaychuk: reviewed my dad's company AWS expenses, devs love over-provisioning, by like 90% too, guess that's where "serverless" cost savings come in
    • @karpathy: Nature is evolving ~7 billion ~10 PetaFLOP NI agents in parallel, and has been for ~10M+s of years, in a very realistic simulator. Not fair.
    • @rbranson: This is funny, but legit. Production software tends to be ugly because production is ugly. The ugliness outpaces our ability to abstract it.
    • @joeweinman: @harrietgreen1 : Watson IoT center opened in Munich... $200 million dollar investment; 1000 engineers #ibminterconnect
    • David Gerard: This [IBM Blockchain Service] is bollocks all the way down.
    • digi_owl: Sometimes it seems that the diff between a CPU and a cluster is the suffix put on the latency times.
    • Scott Aaronson: I’m at an It from Qubit meeting at Stanford, where everyone is talking about how to map quantum theories of gravity to quantum circuits acting on finite sets of qubits, and the questions in quantum circuit complexity that are thereby raised.
    • Founder Collective: Firebase didn’t try to do everything at once. Instead, they focused on a few core problems and executed brilliantly. “We built a nice syntax with sugar on top,” says Tamplin. “We made real-time possible and delightful.” It is a reminder that entrepreneurs can rapidly add value to the ecosystem if they really focus.
    • Elizabeth Kolbert: Reason developed not to enable us to solve abstract, logical problems or even to help us draw conclusions from unfamiliar data; rather, it developed to resolve the problems posed by living in collaborative groups. 
    • Western Union: the ‘telephone’ has too many shortcomings to be seriously considered as a means of communication.
    • Arthur Doskow: being fair, being humane may cost money. And this is the real issue with many algorithms. In economists’ terms, the inhumanity associated with an algorithm could be referred to as an externality. 
    • Francis: The point is that even if GPUs will support lower precision data types exclusively for AI, ML and DNN, they will still carry the big overhead of the graphics pipeline, hence lower efficiency than an FPGA (in terms of FLOPS/WATT). The winner? Dedicated AI processors, e.g. Google TPU
    • James Glasnapp: When we move out of the physical space to a technological one, how is the concept of a “line” assessed by the customer who can’t actually see the line? 
    • Frank: On the other hand, if institutionalized slavery still existed, factories would be looking at around $7,500 in annual costs for housing, food and healthcare per “worker”.
    • Baron Schwartz: If anyone thought that NoSQL was just a flare-up and it’s died down now, they were wrong...In my opinion, three important areas where markets aren’t being satisfied by relational technologies are relational and SQL backwardness, time series, and streaming data. 
    • CJefferson: The problem is, people tell me that if I just learn Haskell, Idris, Closure, Coffescript, Rust, C++17, C#, F#, Swift, D, Lua, Scala, Ruby, Python, Lisp, Scheme, Julia, Emacs Lisp, Vimscript, Smalltalk, Tcl, Verilog, Perl, Go... then I'll finally find 'programming nirvana'.
    • @spectatorindex: Scientists had to delete Urban Dictionary's data from the memory of IBM's Watson, because it was learning to swear in its answers.
    • Animats: [Homomorphically Encrypted Deep Learning] is a way for someone to run a trained network on their own machine without being able to extract the parameters of the network. That's DRM.
    • Dino Dai Zovi: Attackers will take the least cost path through an attack graph from their start node to their goal node.
    • @hshaban: JUST IN: Senate votes to repeal web privacy rules, allowing broadband providers to sell customer data w/o consent including browsing history
    • KBZX5000: The biggest problem you face, as a student, when taking a programming course at a University level, is that the commercially applicable part of it is very limited in scope. You tend to become decent at writhing algorithms. A somewhat dubious skill, unless you are extremely gifted in mathematics and / or somehow have access to current or unique hardware IP's (IP as in Intellectual Property).
    • Brian Bailey: The increase in complexity of the power delivery network (PDN) is starting to outpace increases in functional complexity, adding to the already escalating costs of modern chips. With no signs of slowdown, designers have to ensure that overdesign and margining do not eat up all of the profit margin.
    • rbanffy: Those old enough will remember the AS/400 (now called iSeries) computers map all storage to a single address space. You had no disk - you had just an address space that encompassed everything and an OS that dealt with that.
    • @disruptivedean: Biggest source of latency in mobile networks isn't milliseconds in core, it's months or years to get new cell sites / coverage installed
    • Greg Ferro: Why Is 40G Ethernet Obsolete? Short Answer: COST. The primary issue is that 40G Ethernet uses 4x10G signalling lanes. On UTP, 40G uses 4 pairs at 10G each. 
    • @adriaanm: "We chose Scala as the language because we wanted the latest features of Spark, as well as [...] types, closures, immutability [...]"Adriaan Moors added,
    • ajamesm: There's a difference between (A) locking (waiting, really) on access to a critical section (where you spinlock, yield your thread, etc.) and (B) locking the processor to safely execute a synchronization primitive (mutexes/semaphores).
    • @evan2645: "Chaos doesn't cause problems, it reveals them" - @nora_js #SREcon17Americas #SRECon17
    • chrissnell: We've been running large ES clusters here at Revinate for about four years now. I've found the sweet spot to be about 14-16 data nodes, plus three master-only nodes. Right now, we're running them under OpenStack on top of our own bare metal with SAS disks. It works well but I have been working on a plan to migrate them to live under Kubernetes like the rest of our infrastructure. I think the answer is to put them in StatefulSets with local hostPath volumes on SSD.
    • @beaucronin: Major recurring theme of deep learning twitter is how even those 100% dedicated to the field can't keep up with progress.
    • Chris McNab: VPN certificates and keys are often found within and lifted from email, ticketing, and chat services.
    • @bodil: And it took two hours where the Rust version has taken three days and I'm still not sure it works.
    • azirbel: One thing that's generalizable (though maybe obvious) is to explicitly define the SLAs for each microservice. There were a few weeks where we gave ourselves paging errors every time a smaller service had a deploy or went down due to unimportant errors.
    • bigzen: I'm worn out on articles dissing the performance of SQL databases without quoting any hard numbers and then proceeding to replace the systems with no thanks of development in the latest and great tech. I have nothing against spark, but I find it very hard to believe that alarm code is now readable than SQL. In fact, my experience is just the opposite.
    • jhgg: We are experimenting with webworkers to power a very complicated autocomplete and scoring system in our client. So far so good. We're able to keep the UI running at 60fps while we match, score and sort results in a web-worker.
    • DoubleGlazing: NoSQL doesn't reduce development effort. What you gain from not having to worry about modifying schemas and enforcing referential integrity, you lose from having to add more code to your app to check that a DB document has a certain value. In essence you are moving responsibility for data integrity away from the DB and in to your app, something I think is quite dangerous.
    • Const-me: Too bad many computer scientists who write books about those algorithms prefer to view RAM in an old-fashioned way, as fast and byte-addressable.
    • Azur: It always annoys me a bit when tardigrades are described as extremely hardy: they are not. It is ONLY in the desiccated, cryptobiotic, form they are resistant to adverse conditions.
    • rebootthesystem: Hardware engineers can design FPGA-based hardware optimized for ML. A second set of engineers then uses these boards/FPGA's just as they would GPU's. They write code in whatever language to use them as ML co-processors. This second group doesn't have to be composed of hardware engineers. Today someone using a GPU doesn't have to be a hardware engineer who knows how to design a GPU. Same thing.

  • There should be some sort of Metcalfe's law for events. Maybe: the value of a platform is proportional to the square of the number of scriptable events emitted by unconnected services in the system. CloudWatch Events Now Supports AWS Step Functions as a Target@ben11kehoe: This is *really* useful: Automate your incident response processes with bulletproof state machines #aws

  • Cute faux O'Reilly book cover. Solving Imaginary Scaling Issues.

  • Intel's Optane SSD is finally out, though not quite meeting it's initial this will change everything promise, it still might change a lot of things. Intel’s first Optane SSD: 375GB that you can also use as RAM. 10x DRAM latency. 1/1000 NAND latency. 2400MB/s read, 2000MB/s write. 30 full-drive writes per day. 2.5x better density. $4/GB (1/2 RAM cost). 1.5TB capacity. 500k mixed random IOPS. Great random write response. Targeted at power users with big files, like databases. NDAs are still in place so there's more to learn later. PCPerspective: comparing a server with 768GB of DRAM to one with 128GB of DRAM combined with a pair of P4800X's, 80% of the transactions per second were possible (with 1/6th of the DRAM). More impressive was that matrix multiplication of the data saw a 1.1x *increase* in performance. This seems impossible, as Optane is still slower than DRAM, but the key here was that in the case of the DRAM-only configuration, half of the database was hanging off of the 'wrong' CPU.  foboz1: For anyone think that this a solution looking for a problem, think about two things: Big Data and mobile/embedded. Big Data has an endless appetite for large quantities for memory and fast storage; 3D XPoint plays into the memory hierarchy nicely. At the extreme other end of the scale, it may be fast enough to obviate the need for having DRAM+NAND in some applications. raxx7: And 3D XPoint isn't free of limitations yet. RAM has 50-100 ns latency, 50 GB/s bandwidth (128 bit interface) and unlimited write endurance. If 3D XPoint NVDIMM can't deliver this, we'll still need to manage the difference between RAM and 3D XPoint NVDIMM. zogus: The real breakthrough will come, I think, when the OS and applications are re-written so that they no longer assume that a computer's memory consists of a small, fast RAM bank and a huge, slow persistent set of storage--a model that had held true since just about forever. VertexMaster: Given that DRAM is currently an order of magnitude faster (and several orders vs this real-world x-point product) I really have a hard time seeing where this fits in. sologoub: we built a system using Druid as the primary store of reporting data. The setup worked amazingly well with the size/cardinality of the data we had, but was constantly bottlenecked at paging segments in and out of RAM. Economically, we just couldn't justify a system with RAM big enough to hold the primary dataset...I don't have access to the original planning calculations anymore, but 375GB at $1520 would definitely have been a game changer in terms of performance/$, and I suspect be good enough to make the end user feel like the entire dataset was in memory.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

Stuff The Internet Says On Scalability For March 17th, 2017

Fri, 03/17/2017 - 17:08

Hey, it's HighScalability time:

 

Can it be a coincidence trapping autonomous cars is exactly how demons are trapped on Supernatural?
If you like this sort of Stuff then please support me on Patreon.
  • billion billion: exascale operations per second; 250ms: connection time saved by zero round trip time resumption; 800 Million: tons of prey eaten by spiders; 90%: accuracy of quantum computer recognizing trees; 80 GB/s: S3 across 2800 simultaneous functions;

  • Quotable Quotes:
    • @GossiTheDog: Here's something to add to your security threat model: backups. Why steal live data and when you can drive away with exact replica?
    • @ThePublicSquare: "California produces 160% of its 1990 manufacturing, but with just 60% of the workers." -@uclaanderson economist Jerry Nickelsburg
    • @rbranson: makes total sense. I have a friend (who is VC-backed) that has stuff in Azure, GCloud, and AWS to maximize the free credits.
    • @AndrewYNg: If not for US govt funding (DARPA, NSF), US wouldn't be an AI leader today. Proposed cuts to science is big step in wrong direction.
    • @CodeWisdom: "To understand a program you must become both the machine and the program." - Alan Perlis 
    • @codemanship: What does it take to achieve Continuous Delivery? 1. Continuous testing. e.g., Google have 4.2M automated tests, run avg of 35x a day
    • @sebastianstadil: Azure Storage services are down. They really are doing everything like AWS.
Categories: Architecture

Architecture of Probot - My Slack and Messenger Bot for Answering Questions

Wed, 03/15/2017 - 17:25

I programmed a thing. It’s called Probot. Probot is a quick and easy way to get high quality answers to your accounting and tax questions. Probot will find a real live expert to answer your question and handle all the details. You can get your questions answered over Facebook Messenger, Slack, or the web. Answers start at $10. That’s the pitch.

Seems like a natural in this new age of bots, doesn’t it? I thought so anyway. Not so much (so far), but more on that later.

I think Probot is interesting enough to cover because it’s a good example of how one programmer--me---can accomplish quite a lot using today’s infrastructure.

All this newfangled cloud/serverless/services stuff does in fact work. I was able to program a system spanning Messenger, Slack, and the web, in a way that is relatively scalabile, available, and affordable, while requiring minimal devops.

Gone are the days of worrying about VPS limits, driving down to a colo site to check on a sick server, or even worrying about auto-scaling clusters of containers/VMs. At least for many use cases.

Many years of programming experience and writing this blog is no protection against making mistakes. I made a lot of stupid stupid mistakes along the way, but I’m happy with what I came up with in the end.

Here’s how Probot works....

Platform
Categories: Architecture

Sponsored Post: Aerospike, Loupe, Clubhouse, GoCardless, Auth0, InnoGames, Contentful, Stream, Scalyr, VividCortex, MemSQL, InMemory.Net

Tue, 03/14/2017 - 16:56

Who's Hiring?
  • GoCardless is building the payments network for the internet. We’re looking for DevOps Engineers to help scale our infrastructure so that the thousands of businesses using our service across Europe can take payments. You will be part of a small team that sets the direction of the GoCardless core stack. You will think through all the moving pieces and issues that can arise, and collaborate with every other team to drive engineering efforts in the company. Please apply here.

  • InnoGames is looking for Site Reliability Engineers. Do you not only want to play games, but help building them? Join InnoGames in Hamburg, one of the worldwide leading developers and publishers of online games. You are the kind of person who leaves systems in a better state than they were before. You want to hack on our internal tools based on django/python, as well as improving the stability of our 5000+ Debian VMs. Orchestration with Puppet is your passion and you would rather automate stuff than touch it twice. Relational Database Management Systems aren't a black hole for you? Then apply here!

  • Contentful is looking for a JavaScript BackEnd Engineer to join our team in their mission of getting new users - professional developers - started on our platform within the shortest time possible. We are a fun and diverse family of over 100 people from 35 nations with offices in Berlin and San Francisco, backed by top VCs (Benchmark, Trinity, Balderton, Point Nine), growing at an amazing pace. We are working on a content management developer platform that enables web and mobile developers to manage, integrate, and deliver digital content to any kind of device or service that can connect to an API. See job description.
Fun and Informative Events
  • Analyst Webinar: Forrester Study on Hybrid Memory NoSQL Architecture for Mission-Critical, Real-Time Systems of Engagement. Thursday, March 30, 2017 | 11 AM PT / 2 PM ET. In today’s digital economy, enterprises struggle to cost-effectively deploy customer-facing, edge-based applications with predictable performance, high uptime and reliability. A new, hybrid memory architecture (HMA) has emerged to address this challenge, providing real-time transactional analytics for applications that require speed, scale and a low total cost of ownership (TCO). Forrester recently surveyed IT decision makers to learn about the challenges they face in managing Systems of Engagement (SoE) with traditional database architectures and their adoption of an HMA. Join us as our guest speaker, Forrester Principal Analyst Noel Yuhanna, and Aerospike’s VP Marketing, Cuneyt Buyukbezci, discuss the survey results and implications for your business. Learn and register

  • Your event here!
Cool Products and Services
  • www.site24x7.com : Monitor End User Experience from a global monitoring network. 

  • ButterCMS is an API-first CMS that quickly integrates into your app. Rapidly build CMS-powered experiences in any programming language. Great for blogs, marketing pages, knowledge bases, and more. Butter plays well with Ruby, Rails, Node.js, Go, PHP, Laravel, Python, Flask, Django, and more.

  • Working on a software product? Clubhouse is a project management tool that helps software teams plan, build, and deploy their products with ease. Try it free today or learn why thousands of teams use Clubhouse as a Trello alternative or JIRA alternative.

  • A note for .NET developers: You know the pain of troubleshooting errors with limited time, limited information, and limited tools. Log management, exception tracking, and monitoring solutions can help, but many of them treat the .NET platform as an afterthought. You should learn about Loupe...Loupe is a .NET logging and monitoring solution made for the .NET platform from day one. It helps you find and fix problems fast by tracking performance metrics, capturing errors in your .NET software, identifying which errors are causing the greatest impact, and pinpointing root causes. Learn more and try it free today.

  • Auth0 is the easiest way to add secure authentication to any app/website. With 40+ SDKs for most languages and frameworks (PHP, Java, .NET, Angular, Node, etc), you can integrate social, 2FA, SSO, and passwordless login in minutes. Sign up for a free 22 day trial. No credit card required. Get Started Now.

  • Build, scale and personalize your news feeds and activity streams with getstream.io. Try the API now in this 5 minute interactive tutorial. Stream is free up to 3 million feed updates so it's easy to get started. Client libraries are available for Node, Ruby, Python, PHP, Go, Java and .NET. Stream is currently also hiring Devops and Python/Go developers in Amsterdam. More than 400 companies rely on Stream for their production feed infrastructure, this includes apps with 30 million users. With your help we'd like to ad a few zeros to that number. Check out the job opening on AngelList.

  • Scalyr is a lightning-fast log management and operational data platform.  It's a tool (actually, multiple tools) that your entire team will love.  Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. .  Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • InMemory.Net provides a Dot Net native in memory database for analysing large amounts of data. It runs natively on .Net, and provides a native .Net, COM & ODBC apis for integration. It also has an easy to use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • VividCortex is a SaaS database monitoring product that provides the best way for organizations to improve their database performance, efficiency, and uptime. Currently supporting MySQL, PostgreSQL, Redis, MongoDB, and Amazon Aurora database types, it's a secure, cloud-hosted platform that eliminates businesses' most critical visibility gap. VividCortex uses patented algorithms to analyze and surface relevant insights, so users can proactively fix future performance problems before they impact customers.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

If you are interested in a sponsored post for an event, job, or product, please contact us for more information.

Categories: Architecture

Stuff The Internet Says On Scalability For March 10th, 2017

Fri, 03/10/2017 - 17:56

Hey, it's HighScalability time:

 

Darknet is 4x more resilient than the Internet. An apt metaphor? (URV)
If you like this sort of Stuff then please support me on Patreon.
  • > 5 9s: Spanner availability; 200MB: random access from DNA storage; 215 Pbytes/gram: DNA storage; 287,024: Google commits to open source; 42: hours of audio gold; 33: minutes to get back into programming after interruption; 12K: Chinese startups started per day; 35 million: tons of good shipped under Golden Gate Bridge; 209: mph all-electric Corvette; 500: Disney projects in the cloud; 40%: rise in CO2; 

  • Quoteable Quotes:
    • Marc Rogers: Anything man can make man can break
    • @manupaisable: 10% of machines @spotify rebooted every hour because of defunct #docker - war stories by @i_maravic @qconlondon
    • @robertcottrell: “the energy cost of each bitcoin transaction is enough to power 3.17 US households for a day”
    • Eric Schmidt: We put $30 billion into this platform. I know this because I approved it. Why replicate that?
    • dim: It uses p30 technology. Just basic things, gliders and lightweight spaceships. Basically, the design goes top-down: At the very top, there's the clock. It is a 11520 period clock. Note that you need about 10.000 generations to ensure the display is updated appropriately, but the design should still be stable with a clock of smaller period (about 5.000 or so - the clock needs to be multiple of 60).
    • Luke de Oliveira: Most people in AI forget that the hardest part of building a new AI solution or product is not the AI or algorithms — it’s the data collection and labeling. Standard datasets can be used as validation or a good starting point for building a more tailored solution.
    • @violetblue: Did a lot of people not know that the CIA is a spy agency?
    • @viktorklang: Async is not about *performance*—it is about *scalability*. Let your friends know
    • stillsut: The difference is in the old days, you adapted to computer. Now, computer must adapt to you.
    • Eric Brewer: Spanner uses two-phase commit to achieve serializability, but it uses TrueTime for external consistency, consistent reads without locking, and consistent snapshots.
    • Emily Waltz: Nomura’s molecular robot differs in that it is composed entirely of biological and chemical components, moves like a cell, and is controlled by DNA.
    • Chris Anderson: Most of the devices in our life, from our cars to our homes, are “entropic,” which is to say they get worse over time. Every day they become more outmoded. But phones and drones are “negentropic” devices. Because they are connected, they get better, because the value comes from the software, not hardware
    • William Dutton: Most people using the internet are actually more social than those who are not using the internet
    • @swardley:  ... by 2016, you should have dabbled / learn / tested serverless.  "Go IaaS" or "build our biz as a cloud" in 2017 is #facepalm
    • Bradford Cross: The incompetent segment: the incompetent segment isn’t going to get machine learning to work by using APIs. They are going to buy applications that solve much higher level problems. Machine learning will just be part of how they solve the problems.
    • @denormalize: What do we want? Machine readable metadata! When do we want it? ERROR Line 1: Unexpected token `
    • @Ocramius: "And we should get rid of users: users are not pure, since they modify the state of our system" #confoo
    • Morning Paper: The most important overarching lesson from our study is this: a single file-system fault can induce catastrophic outcomes in most modern distributed storage systems. 
    • Linus Torvalds: And if the DRM "maintenance" is about sending me random half-arsed crap in one big pull, I'm just not willing to deal with it. This is like the crazy ARM tree used to be.
    • Shaun McCormick: Technical Debt is a Positive and Necessary Step in software engineering
    • @tdierks: Hello, my name is Tim. I'm a lead at Google with over 30 years coding experience and I need to look up how to get length of a python string.
    • @codinghorror: I colocated a $600 Ali Express mini pc for $15/month and it is 2x faster than "the cloud"
    • @antirez: "Group chat is like being in an all-day meeting with random participants and no agenda".
    • @sriramhere: Wise man once wrote "As flexible as it is, compute in AWS is optimized for the old capex world." @sallamar
    • @wattersjames: AI will come to your company carefully disguised as a lot of ETL and data-pipeline work...
    • ceejayoz: Lambda's billed in 100 millisecond increments. EC2 servers are billed in one hour increments. If you need short tasks that run in bursty workloads, Lambda's (potentially) a no-brainer.
    • @codinghorror: we have not found bare metal colocation to be difficult, with one exception: persistent file storage. That part, strangely, is quite hard.
    • @jbeda: Lesson from 10 years at Google: this is true until it isn't. Sometimes you *can* build a better mouse trap.
    • jfoutz: I agree. It's genius in a Lex Luthor kind of way. If I understood the full scope of the application, I like to think i'd decline to work on that. It's easy to imagine engineers working on small parts of the system, and never really connecting the dots that the whole point is to evade law enforcement.
    • dsr_:  It's harder (but not impossible) to have complete service lossage like this [Slack] in a federated protocol. That's why you didn't hear about the great email collapse of 2006.
    • throw_away_777: I agree that neural nets are state-of-the-art and do quite well on certain types of problems (NLP and vision, which are important problems). But a lot of data is structured (sales, churn, recommendations, etc), and it is so much easier to train an xgboost model than a neural net model. 
    • @GossiTheDog: #Vault7 CIA - Wiki that Wikileaks released is/was on hosted on DEVLAN, the CIA's "dirty" development network - a major architecture error.
    • Alison Gopnik: new studies suggest that both the young and the old may be especially adapted to receive and transmit wisdom. We may have a wider focus and a greater openness to experience when we are young or old than we do in the hurly-burly of feeding, fighting and reproduction that preoccupies our middle years.
    • @pierre: Wow, audacious to say the least. Intentionally flagging authorities to mislead them. It's like the VW emissions code
    • Joan Gamell: Starting with the obvious: the CIA uses JIRA, Confluence and git. Yes, the very same tools you use every day and love/hate. 
    • Chris Baraniuk: The networks of genes in each animal is a bit like the network of neurons in our brains, which suggests they might be "learning" as they go
    • futurePrimitive: Managers seem to think that programming is typing. No. Programming is *thinking*. The stuff that *looks* like work to a manager (energetic typing) only happens after the hard work is done silently in your head.
    • @danielbryantuk: "There is no such thing as a 'stateless' architecture. It's just someone else's problem" @jboner #qconlondon
    • Platypus: There's no panacea for vendor lock-in. Not even open source, but open source alone gets you further than any number of standards that don't cover what really matters or vendor-provided tools that might go away at any moment. It's the first and best tool for dealing with lock-in, even if it's not perfect. 
    • @tpuddle: @cliff_click talk at #qconlondon about fraud detection in financial trades. Searching 1 billion trades a day "is not that big". !
    • @charleshumble: "Something I see in about 95% of the trading data sets is there are a small number of bad guys hammering it." Cliff Click #qconlondon

  • You may not be able to hear doves cry, but you can listen to machines talk. Elevators to be precise. Watch them chat away as they selflessly shuttle to and fro. Yes, it is as exciting as you might imagine. Though probably not very different than the interior dialogue of your average tool.

  • It used to be that winners wrote history. Now victors destroy data. Terabytes of Government Data Copied

  • Battling legacy code seems to be the number one issue on Stack Overflow, as determined by top books mentioned on Stack Overflow. Not surprising. What was surprising is what's not on the list: algorithm books. Books on the craft of programming took top honors. Gratifying, but at odds with current interviewing dogma. The top 10 books: Working Effectively with Legacy Code; Design Patterns; Clean Code; Java concurrency in practice; Domain-driven Design; JavaScript; Patterns of Enterprise Application Architecture;  Code Complete; Refactoring; Head First Design Patterns.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

Part 4 of Thinking Serverless —  Addressing Security Issues

Mon, 03/06/2017 - 17:56

This is a guest repost by Ken Fromm, a 3x tech co-founder — Vivid Studios, Loomia, and Iron.io. Here's Part 1 and 2 and 3

This post is the last of a four-part series of that will dive into developing applications in a serverless way. These insights are derived from several years working with hundreds of developers while they built and operated serverless applications and functions.

The platform was the serverless platform from Iron.io but these lessons can also apply to AWS LambdaGoogle Cloud FunctionsAzure Functions, and IBM’s OpenWhisk project.

Arriving at a good definition of cloud IT security is difficult especially in the context of highly scalable distributed systems like those found in serverless platforms. The purpose of this post is to not to provide an exhaustive set of principles but instead highlight areas that developers, architects, and security officers might wish to consider when evaluating or setting up serverless platforms.

Serverless Processing — Similar But Different

High-scale task processing is certainly not a new concept in IT as it has parallels that date back to the days of job processing on mainframes. The abstraction layer provided by serverless process — in combination with large-scale cloud infrastructure and advanced container technologies — does, however, bring about capabilities that are markedly different than even just a few years ago.

By plugging into an serverless computing platforms, developers do not need to provision resources based on current or anticipated loads or put great effort into planning for new projects. Working and thinking at the task level means that developers are not paying for resources they are not using. Also, regardless of the number of projects in production or in development, developers using serverless processing do not have to worry about managing resources or provisioning systems.

While serving as Iron.io’s security officer, I answered a number of security questionnaires from customers. One common theme is that they were all in need of a serious update to bring them forward into this new world. Very few had any accommodation for cloud computing much less serverless processing.

Most questionnaires still viewed servers as persistent entities needing constant care and feeding. They presumed physical resources as opposed to virtualization, autoscaling, shared resources, and separation of concerns. Their questions lack differentiation between data centers and development and operation centers. A few still asked for the ability to physically inspect data centers which is, by and large, not really an option these days. And very few addressed APIs, logging, data persistence, or data retention.

The format of the sections below follows the order found in many of these security questionnaires as well as several cloud security policies. The order has been flipped a bit to start with areas where developers can have an impact. Later sections will address platform and system issues which teams will want to be aware of but are largely in the domain of serverless platforms and infrastructure providers.

Security Topics

Data Security
Categories: Architecture

Stuff The Internet Says On Scalability For March 3rd, 2017

Fri, 03/03/2017 - 17:56

Hey, it's HighScalability time:

 

Only 235 trillion miles away. Engage. (NASA)
If you like this sort of Stuff then please support me on Patreon.
  • $5 billion: Netflix spend on new content; $1 billion: Netflix spend on tech; 10%: bounced BBC users for every additional second page load; $3.5 billion: Priceline Group ad spend; 12.6 million: hours streamed by Pornhub per day; 1 billion: hours streamed by YouTube per day; 38,000 BC: auroch carving; 5%: decrease in US TV sets;

  • Quotable Quotes:
    • Fahim ul Haq: Rule 1: Reading High Scalability a night before your interview does not make you an expert in Distributed Systems.
    • @Pinboard: Root cause of outage: S3 is actually hosted on Google Cloud Storage, and today Google Cloud Storage migrated to AWS
    • Matthew Green: ransomware currently is using only a tiny fraction of the capabilities available to it. Secure execution technologies in particular represent a giant footgun just waiting to go off if manufacturers get things only a little bit wrong.
    • dsr_: This [S3 outage] is analogous to "we needed to fsck, and nobody realized how long that would take".
    • tptacek: Uber isn't the driver's employer. Uber is a vendor to the driver. The driver is complaining that its vendor made commitments, on which the driver depended, and then reneged. The driver might be right or might be wrong, but in no discussion with a vendor in the history of the Fortune 500 has it ever been OK for the vendor to accuse their customer of "not taking responsibility for their own shit".
    • @felixsalmon: Hours of video served per day: Facebook: 100 million Netflix: 116 million YouTube: 1 billion
    • @Geek_Manager: "Everybody wants to write reusable code. Nobody wants to reuse anyone else's code." @eryno #leaddev
    • @ellenhuet: a private South Bay high school 1) having a growth fund and 2) being early in Snap is the most silicon valley thing ever
    • @_ginger_kid: I speak from experience as a cash strapped startup CTO. Would love to be multi region, just cannot justify it. V hard.
    • @Objective_Neo: SpaceX, $12 billion valuation: Launches 70m rockets into space and lands them safely. Snapchat, $20 billion valuation: Rainbow Filters.
    • @neil_conway: (2/4): MTTR (repair time) is AT LEAST as important as MTBF in determining service uptime and often easier to improve.
    • John Hagel: we’re likely to see a new category of gig work emerge – let’s call it “creative opportunity targeting.”...we anticipate that more and more of the workforce will be pulled into this arena of creative gig workgroups
    • Seyi Fabode: The constraint is that the broker model, even with new technology, is not value additive. 
    • Robert Kolker: From his experience with the Gary police, Hargrove learned the first big lesson of data: If it’s bad news, not everyone wants to see the numbers
    • gamache: A piece of hard-earned advice: us-east-1 is the worst place to set up AWS services. You're signing up for the oldest hardware and the most frequent outages.
    • Dan Sperber: we each have a great many mental devices that contribute to our cognition. There are many subsystems. Not two, but dozens or hundreds or thousands of little mechanisms that are highly specialized and interact in our brain. Nobody doubts that something like this is the case with visual perception. I want to argue that it’s also the case for the so-called central systems, for reasoning, for inference in general.
    • Joaquin Quiñonero Candela: Facebook today cannot exist without AI. Every time you use Facebook or Instagram or Messenger, you may not realize it, but your experiences are being powered by AI.
    • alicebob: Sometimes keeping things simple is worth more than keeping things globally available.
    • Sveta Smirnova: Both MySQL and PostgreSQL on a machine with a large number of CPU cores hit CPU resources limits before disk speed can start affecting performance.
    • @jamesiry: Using many $100,000’s of compute, Google collided a known weak hash. Meanwhile one botched memcpy leaked the Internet’s passwords.
    • @david4096: teaching engineers to say no is cheaper than Haskell
    • @cgvendrell: #AI will be dictated by Google. They're 1 order of magnitude ahead, they understood key = chip level of stack (TPU) + training data @chamath
    • @antirez: There are tons of more tests to do, but the radix trees could replace most hash tables inside Redis in the future: faster & smaller.
    • DHH: So it remains mostly our fault. Our choice, our dollars. Every purchase a vote for an ever more dysfunctional future. We will spend our way into the abyss.
    • @jamesurquhart: This is why I write about data stream processing and serverless—lessons I learned at @SOASTAInc about the value of real time and BizOps.
    • twakefield: The brilliance of open sourcing Borg (aka Kubernetes) is evident in times like these. We[0] are seeing more and more SaaS companies abstract away their dependencies on AWS or any particular cloud provider with Kubernetes.
    • flak: password hashes aren’t broken by cryptanalysis. They’re rendered irrelevant by time (hardware advancements). What was once expensive is now cheap, what was once slow is now fast. The amount of work hasn’t been reduced, but the difficulty of performing it has.
    • @darkuncle: biz decisions again ... gotta weigh cost/frequency of AWS single-region downtime vs. cost/complexity of multi-region & GSLB.
    • @nantonius: Reducing network latencies are a key enabler for moving away from monolith towards serverless. @adrianco:
    • tbrowbdidnso: These companies that all run their own hardware exclusively are telling everyone that it's stupid to run your own hardware... Why are we listening?
    • jasonhoyt: "People make mistakes all the time...the problem was that our systems that were designed to recognize and correct human error failed us." 
    • @chuhnk: Bob: Service Discovery is a SPOF. You should build async services. Me: How do you receive messages? Bob: A message bus Me: ...
    • @JoeEmison: These articles on serverless remind me of articles on NoSQL from a few years ago. FaaS may have low adoption b/c of the req'd architectures.
    • @Jason: We have 30-60% open rates for http://inside.com  emails vs 1% for app downloads!
    • @adrianco: Split brain syndrome: half your brain thinks message buses are reliable. Other half is wondering how to recover from split brain syndrome.
    • @dbrady: The older I get, the less I care about making tech decisions right and the more I care about retaining the ability to change a wrong one.
    • @littleidea: "Automation code, like unit test code, dies when the maintaining team isn’t obsessive about keeping the code in sync with the codebase."
    • @adulau: I don't ask for bug bounties, fame, cash or even tshirt. I just want a good security point of contact to fix the issues.
    • StorageMojo: most of the SSD vendors don’t make AFAs [all flash arrays]. They have little to lose by pushing NVMe/PCIe SSDs for broad adoption.
    • cookiecaper: I mean, that's not really AWS's problem, is it? Outages happen. If you have a mission-critical service like health care, you really shouldn't write systems with single points of failure like this, especially not systems that depend on something consumer-grade like S3.
    • plgeek: To me his main point is there is a spectrum of what you might consider evidence/proof. However, in Software Engineering their have been low standards set, and it's really not acceptable to continue with low standards. He is not saying the only sort of acceptable evidence is a double blind study.
    • n00b101: I asked an Intel chip designer about this and his opinion was that asynchronous processors are a "fantasy." His reasoning was that an asynchronous chip would still need to synchronize data communication within the chip. Apparently global clock synchronization accounts for about 20% of the power usage of a synchronous chip. In the asynchronous case, if you had to synchronize every communication, then the cost of communication is doubled.

  • Anti-virus software uses fingerprinting as a detection technique. Surprise, nature got there first. Update: CRISPR. Bacteria grab pieces of DNA from viri and store them. This lets them recognize a virus later. When a virus enters a bacteria the bacteria will send out enzymes to combat the invader. Usually the bacteria dies. Sometimes the bacteria wins. The bacteria sends out enzymes to find stray viruses and cut the enemy DNA into small pieces. Those enzymes take the little bits of DNA and splice them into the bacteria's own DNA. DNA is used as a memory device. Next time the virus shows up the bacteria creates molecular assassins that contain a copy of the virus DNA. If there's a pattern match then kill it. The protein looks something like a clam shell. It has a copy of the virus DNA. Whenever it bumps into some virus DNA it pulls apart the DNA, unzips it, reads it, if it's not the right one it moves on. If the RNA has the same sequence then molecular blades come out and chop. Like smart scissors. This is CRISPR.

  • Videos from microXchg 2017 are now available

  • A natural disaster occurred. S3 went down. Were you happy with how your infrastructure responded? @spire was. Mitigating an AWS Instance Failure with the Magic of Kubernetes: "Kubernetes immediately detected what was happening. It created replacement pods on other instances we had running in different availability zones, bringing them back into service as they became available. All of this happened automatically and without any service disruption, there was zero downtime. If we hadn’t been paying attention, we likely wouldn’t have noticed everything Kubernetes did behind the scenes to keep our systems up and running." How do you make this happen?: Distribute nodes across multiple AZs; Nodes should have capacity to handle at least one node failure; Use at least 2 pods per deployment; Use readiness and liveness probes.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

Getting Started with Lyft Envoy for Microservices Resilience

Wed, 03/01/2017 - 17:57

This is a guest repost by Flynn at datawireio on Envoy, a Layer 7 communications bus, used throughout Lyft's service-oriented architecture.

Using microservices to solve real-world problems always involves more than simply writing the code. You need to test your services. You need to figure out how to do continuous deployment. You need to work out clean, elegant, resilient ways for them to talk to each other.

A really interesting tool that can help with the “talk to each other” bit is Lyft’s Envoy: “an open source edge and service proxy, from the developers at Lyft.” (If you’re interested in more details about Envoy, Matt Klein gave a great talk at the 2017 Microservices Practitioner Summit.)

Envoy Overview

It might feel odd to see us call out something that identifies itself as a proxy – after all, there are a ton of proxies out there, and the 800-pound gorillas are NGINX and HAProxy, right? Here’s some of what’s interesting about Envoy:

  • It can proxy any TCP protocol.
  • It can do SSL. Either direction.
  • It makes HTTP/2 a first class citizen, and can translate between HTTP/2 and HTTP/1.1 (either direction).
  • It has good flexibility around discovery and load balancing.
  • It’s meant to increase visibility into your system.
    • In particular, Envoy can generate a lot of traffic statistics and such that can otherwise be hard to get.
    • In some cases (like MongoDB and Amazon RDS) Envoy actually knows how to look into the wire protocol and do transparent monitoring.
  • It’s less of a nightmare to set up than some others.
  • It’s a sidecar process, so it’s completely agnostic to your services’ implementation language(s).

(Envoy is also extensible in some fairly sophisticated — and complex — ways, but we’ll dig into that later — possibly much later. For now we’re going to keep it simple.)

Being able to proxy any TCP protocol, including using SSL, is a pretty big deal. Want to proxy Websockets? Postgres? Raw TCP? Go for it. Also note that Envoy can both accept and originate SSL connections, which can be handy at times: you can let Envoy do client certificate validation, but still have an SSL connection to your service from Envoy.

Of course, HAProxy can do arbitrary TCP and SSL too — but all it can do with HTTP/2 is forward the whole stream to a single backend server that supports it. NGINX can’t do arbitrary protocols (although to be fair, Envoy can’t do e.g. FastCGI, because Envoy isn’t a web server). Neither open-source NGINX nor HAProxy handle service discovery very well (though NGINX Plus has some options here). And neither has quite the same stats support that a properly-configured Envoy does.

Overall, what we’re finding is that Envoy is looking promising for being able to support many of our needs with just a single piece of software, rather than needing to mix and match things.

Envoy Architecture
Categories: Architecture

Sponsored Post: Aerospike, Loupe, Clubhouse, GoCardless, Auth0, InnoGames, Contentful, Stream, Scalyr, VividCortex, MemSQL, InMemory.Net, Zohocorp

Tue, 02/28/2017 - 17:56

Who's Hiring?
  • GoCardless is building the payments network for the internet. We’re looking for DevOps Engineers to help scale our infrastructure so that the thousands of businesses using our service across Europe can take payments. You will be part of a small team that sets the direction of the GoCardless core stack. You will think through all the moving pieces and issues that can arise, and collaborate with every other team to drive engineering efforts in the company. Please apply here.

  • InnoGames is looking for Site Reliability Engineers. Do you not only want to play games, but help building them? Join InnoGames in Hamburg, one of the worldwide leading developers and publishers of online games. You are the kind of person who leaves systems in a better state than they were before. You want to hack on our internal tools based on django/python, as well as improving the stability of our 5000+ Debian VMs. Orchestration with Puppet is your passion and you would rather automate stuff than touch it twice. Relational Database Management Systems aren't a black hole for you? Then apply here!

  • Contentful is looking for a JavaScript BackEnd Engineer to join our team in their mission of getting new users - professional developers - started on our platform within the shortest time possible. We are a fun and diverse family of over 100 people from 35 nations with offices in Berlin and San Francisco, backed by top VCs (Benchmark, Trinity, Balderton, Point Nine), growing at an amazing pace. We are working on a content management developer platform that enables web and mobile developers to manage, integrate, and deliver digital content to any kind of device or service that can connect to an API. See job description.
Fun and Informative Events
  • DBTA Roundtable Webinar: Fast Data: The Key Ingredients to Real-Time Success. Thursday February 23, 2017 | 11:00 AM Pacific Time. Join Stephen Faig, Research Director Unisphere Research and DBTA, as he hosts a roundtable discussion covering new technologies that are coming to the forefront to facilitate real-time analytics, including in-memory platforms, self-service BI tools and all-flash storage arrays. Brian Bulkowski, CTO and Co-Founder of Aerospike, will be speaking along with presenters from Attunity and Hazelcast. Learn more and register.

  • Your event here!
Cool Products and Services
  • Working on a software product? Clubhouse is a project management tool that helps software teams plan, build, and deploy their products with ease. Try it free today or learn why thousands of teams use Clubhouse as a Trello alternative or JIRA alternative.

  • A note for .NET developers: You know the pain of troubleshooting errors with limited time, limited information, and limited tools. Log management, exception tracking, and monitoring solutions can help, but many of them treat the .NET platform as an afterthought. You should learn about Loupe...Loupe is a .NET logging and monitoring solution made for the .NET platform from day one. It helps you find and fix problems fast by tracking performance metrics, capturing errors in your .NET software, identifying which errors are causing the greatest impact, and pinpointing root causes. Learn more and try it free today.

  • Auth0 is the easiest way to add secure authentication to any app/website. With 40+ SDKs for most languages and frameworks (PHP, Java, .NET, Angular, Node, etc), you can integrate social, 2FA, SSO, and passwordless login in minutes. Sign up for a free 22 day trial. No credit card required. Get Started Now.

  • Build, scale and personalize your news feeds and activity streams with getstream.io. Try the API now in this 5 minute interactive tutorial. Stream is free up to 3 million feed updates so it's easy to get started. Client libraries are available for Node, Ruby, Python, PHP, Go, Java and .NET. Stream is currently also hiring Devops and Python/Go developers in Amsterdam. More than 400 companies rely on Stream for their production feed infrastructure, this includes apps with 30 million users. With your help we'd like to ad a few zeros to that number. Check out the job opening on AngelList.

  • Scalyr is a lightning-fast log management and operational data platform.  It's a tool (actually, multiple tools) that your entire team will love.  Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. .  Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • InMemory.Net provides a Dot Net native in memory database for analysing large amounts of data. It runs natively on .Net, and provides a native .Net, COM & ODBC apis for integration. It also has an easy to use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • VividCortex is a SaaS database monitoring product that provides the best way for organizations to improve their database performance, efficiency, and uptime. Currently supporting MySQL, PostgreSQL, Redis, MongoDB, and Amazon Aurora database types, it's a secure, cloud-hosted platform that eliminates businesses' most critical visibility gap. VividCortex uses patented algorithms to analyze and surface relevant insights, so users can proactively fix future performance problems before they impact customers.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network. 

If any of these items interest you there's a full description of each sponsor below...

Categories: Architecture

Business Case for Serverless

Mon, 02/27/2017 - 17:56

You can’t pick a technical direction without considering the business implications. Mat Ellis, Founder/CEO of Cloudability, in a recent CloudCast episode, makes the business case for Serverless. The argument goes something like:

  • Enterprises know they can’t run services cheaper than Amazon. Even if the cost is 2x the extra agility of the cloud is often worth the multiple.

  • So enterprises are moving to the cloud.

  • Moving to the cloud is a move to services. How do you build services now? Using Serverless.

  • With services businesses use a familiar cost per unit billing model, they can think of paying for services as a cost per database query, cost per terabyte of data, and so on.

  • Since employees are no longer managing boxes and infrastructure they can now focus entirely on business goals.

  • There’s now an opportunity to change business models. Serverless will make new businesses economically viable because they can do things they could never do before based on price and capabilities.

  • Serverless makes it faster to iterate and deploy new code which makes it faster to find a proper product/market fit.

  • Smaller teams with smaller budgets with smaller revenues can do things now that only big companies could do before. Serverless attempts to industrialise developer impact.

  • Consider WhatsApp, which sold to Facebook for $19 billion with only 55 employees. If we’re going to see the first single employee billion user multi-billion dollar valuation startup it will likely be built on Serverless.

Categories: Architecture

Stuff The Internet Says On Scalability For February 24th, 2017

Fri, 02/24/2017 - 17:56

Hey, it's HighScalability time:

 

Great example of Latency As A Pseudo-Permanent Network Partition. A slide effectively cleaved Santa Cruz from the North Bay by slowing traffic to a crawl.
If you like this sort of Stuff then please support me on Patreon.
  • 40 TFLOPS: on Lambda; 7: new habitable planets with good beer; dozens: balloons needed in Loon network; 500 TB/sec: rate at which DNA is copied in human body; 1/2: web is encrypted; 34: regions in Azure; $8k: cost of Tesla self-driving hardware; 99.95%: DMCA takedowns are bot BS; 300 nanometers: new microscope; 7%: AMP traffic to publishers; 

  • Quotable Quotes:
    • @jasonlk: Elon Musk: Self-Driving Car Revolution Will Leave 15% of World Population Without Jobs
    • Near death Archimedes: Stand away, fellow, from my diagram!
    • rumpelstilskin21: Angular and React make for popular headlines on reddit but unless you are working for a major, large web site where such things might be deemed useful by management (and no one else) then quit trying to get educated by the amateurs on reddit.
    • StorageMojo: There is a new paradigm about to hit the industry, which will eviscerate large portions of the current storage ecosystem. Like other major shifts, it is powered by a class of users who are poorly served by existing products and technologies. But if our digital civilization is to survive and prosper, it has to happen. And it will, like it or not.
    • ThatMightBePaul: Worst case scenario: you try Go, don't like it, and you head back to Node more confident that it fits you better. That's still a pretty positive outcome, imo. So, invest the time in Go, and then see which feels right :)
    • Russ: it is the job of the application to properly figure out the network’s limits and try to live within them.
    • World's Second-Best Go Player: After humanity spent thousands of years improving our tactics, computers tell us that humans are completely wrong. I would go as far as to say not a single human has touched the edge of the truth of Go.
    • @mjpt777: After fixing a few more false sharing issues we shaved another ~350ns of Aeron's RTT between machines.
    • @thomasfuchs: 1997: Let’s make a website! *fires up vi* 2007: Let’s make a website! *downloads jQuery* *fires up vi* 2017: Let’s make a website! [very long list of tech]
    • Basho: Do not follow the ancient masters, seek what they sought.
    • hellofunk: If many years ago, someone told me that a humongous company named Alphabet was thinking about deploying balloons all over the world, I'd have told you a thing or two about having a charming imagination. 
    • Russ: Sure, the Internet is broken. But anything we invent will, ultimately, be broken in some way or another. Sure the IETF is broken, and so is open source, and so is… whatever we might invent next. We don’t need a new Internet, we need a little less ego, a lot less mud slinging, and a lot more communication. 
    • @sAbakumoff: Analyzed the sentiment of 80000 Github Commit Comments, it seems that Ruby devs tend to be pretty positive, but c++ are angriest ones!
    • Michael Sawyer: The YouTubers' common enemy is YouTube
    • @jannis_r: "Good size for a microservice: if it fits into one engineers head" @adrianco #AWSTechBreakfast
    • packagecloud: setting [TZ] environment variable can save thousands (or in some cases, tens of thousands) of unnecessary system calls that can be generated by glibc over small periods of time. 
    • @istanboolean: "Hardware has stopped getting faster. Software has not stopped getting slower." @rob_pike
    • Greg Meddles: You're out of memory on some particular Amazon instance, so you bump up to the next biggest in size. That is always the naive solution. Whatever you're doing, you'll usually end up doing more of it. Eventually, you'll end up throwing good money after bad.
    • @viktorklang: Replace the use of sequential, concurrent, and parallel with dependent, coordinated, and independent? Thoughts?
    • Coast Guard Vice Adm. Marshall Lytle: Cyberwarfare is like a soccer game with all the fans on the field with you and no one is wearing uniforms
    • CockroachDB: If you’re serious about building a company around open source software, you must walk a narrow path: introduce paid features too soon, and risk curtailing adoption. Introduce paid features too late, and risk encouraging economic free riders. Stray too far in either direction, and your efforts will ultimately continue only as unpaid open source contribution
    • Veratyr: Deployment [of k8s] is just so much harder than it should be. Fundamentally (I discovered far later on in the process), Kubernetes is comprised of roughly the following services: kube-apiserver, kubelet, kube-proxy, kube-scheduler, kube-controller-manager. The other dependencies are: A CA infrastructure for certificate based authentication, etcd, a container runtime (rkt or Docker) and CNI.
    • @jbeda: I want to go on record: the amount of yaml required to do anything in k8s is a tragedy. Something we need to solve. 

  • What do you get for $5? Quite a lot. $5 Showdown: Linode vs. DigitalOcean vs. Amazon Lightsail vs. Vultr: Linode’s new plan is not only offering the consistently better performance...Linode is still a bit behind the curve when it comes to things like block storage volumes, default SSH keys and yeah, their UI.

  • Another wonderful engineering post from Riot Games. Under the hood of the League Client's Hextech UI: Any given build of the League client is expressed as a list of units called plugins... Back-end plugins that deal purely with data are written as C++ REST microservices...front-end plugins that deal with presentation are written as Javascript client applications and run inside Chromium Embedded Framework...The League client update really is a desktop deployment of an entire constellation of microservices...APIs are thoughtfully designed, any arbitrary combination of features can run cooperatively...In the League client, the common pattern is for dependencies to flow upwards...a WebSocket that allows the front-end plugins to observe back-end plugins for changes...To make implementation of complex video-based elements simpler, we created a state machine library based on Web Components...League client is patched out to players’ local drives, it doesn’t have the same immediate bandwidth constraints...we provide a number of purpose-specific audio channels - UI SFX, Notifications, Music, Voiceover, etc. - through a plugin dedicated to managing audio...We use straight-up native Custom Elements with heavy usage of Shadow DOM.

  • Does insurance cover this? The first SHA1 collision.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

Scaling @ HelloFresh: API Gateway

Mon, 02/20/2017 - 17:56

HelloFresh keeps growing every single day: our product is always improving, new ideas are popping up from everywhere, our supply chain is being completely automated. All of this is simply amazing us, but of course this constant growth brings many technical challenges.

Today I’d like to take you on a small journey that we went through to accomplish a big migration in our infrastructure that would allow us to move forward in a faster, more dynamic, and more secure way.

The Challenge

We’ve recently built an API Gateway, and now we had the complex challenge of moving our main (monolithic) API behind it — ideally without downtime. This would enable us to create more microservices and easily hook them into our infrastructure without much effort.

The Architecture
Categories: Architecture

Stuff The Internet Says On Scalability For February 17th, 2017

Sun, 02/19/2017 - 19:21

Hey, it's HighScalability time:

 

Gorgeous satellite images of a thawing Greenland (NASA).
If you like this sort of Stuff then please support me on Patreon.
  • 1 cubic millimeter: computer with deep-Learning; 1,600: data on nearby stars; 40M: users for largest Parse app; 58x: Tensorflow 1.0 speedup on 64 gpus; 46%: ecommerce controlled by Amazon; 60%: IT growth in public cloud; 200 TB: one tv episode; 

  • Quotable Quotes:
    • @krishnan: Serverless will not be around in 5 years. It will be AI coding AI coding Ai....... Serverless or not doesn't matter #RunForrestRun
    • user5994461: Amazon: Create usual services and sell them. Google: Make unique products that push the boundaries of what was previously thought possible. Amazon: Don't care about inefficiencies and usage. Inefficiencies can be handled by charging more to the clients, usage doesn't matter because the users are mostly the clients and they don't feel their pain. Google: Had to make all their core technologies efficient, performant, scalable and maintainable or they couldn't sustain their business.
    • Hans Rosling: To me, the impressive thing is that people succeed at all.
    • @littleidea: Google Spanner didn't beat CAP, just mitigated the hell out of P
    • @jordw: Cloud Spanner is a very well-engineered CP database that is also very good at being available.
    • Cade Metz: The AI Threat Isn’t Skynet. It’s the End of the Middle Class
    • hosh: Four years ago, I determined that while development work might seem to be near the top of the food chain, there will at some point where my work will be replaced by AIs.
    • mi100hael:  I found Go's "simplicity" to be limiting and frustrating when it came to building production applications. Things like the weird split between functions returning errors but occasionally panicking, lack of inheritance, and poor dependency management through github links make Go a poor choice for applications within a business setting. 
    • @NathanTippy: New #Java web server clearing 1 million HTTP requests per second on 4 core box.  Can run in < 100MB of memory.
    • @kellabyte: It doesn’t matter what the founder or developer of a database tells you. It’s about the true peopeties it guarantees.
    • @swardley: Private cloud starting to drop, public cloud a three horse race - AWS 1st, MSFT 2nd, GooG 3rd ... sensible stuff 
    • @ollekullberg: Kullberg's law: when we increase the size of a microservice we increase the benefit of static typing for this microservice.
    • @swardley: ... it's not lack of engineering capability or finance or market or marketing or branding, the real story of cloud is executive failure.
    • katied: Trophic cascade is a process that starts at the top of a food chain and works its way to the bottom of it. So, even though as predators wolves survive by taking life, they also have the ability to create it.
    • @swardley: Cloud wars in IaaS - oh, please. War was well over in 2012, yes there will be price cuts as constraints are reduced but there is no battle.
    • @HenryR: 1. CAP has always said only one thing: that there is always a particular network failure that forces you to give up either C or A. 2. It has nothing at all to do with how likely that failure mode is. The failure is system-specific. 
    • throwawaydbfif: The movement from ownership to renting on the web is absolutely terrifying to me. Within the span of a few years we've gone from owning our technology to renting it out from a big players for monthly fees that we cannot completely predict or control.
    • computerex: People use cloud computing because it already is massively impractical to run your own servers. Hardware is hard to run and scale on your own and experiences economies of scale. This principle is seen everywhere and can hardly be viewed as something controversial. 
    • stuckagain: You did not ever own your own globally consistent, massively scalable, replicated database. The fact that you can now rent one by the hour is strictly an improvement for you, if you need that kind of thing
    • tedd4u: Aurora is very cool but won't help you much after you vertically scale your master and still need more write capacity. With Cloud Spanner you get horizontal write scalability out of the box. Critical difference.
    • @koivimik: REST != CRUD via HTTP #microXchg @olivergierke
    • Linus: It's almost boring how well our process works. All the really stressful times for me have been about process. They haven't been about code. When code doesn't work, that can actually be exciting ... Process problems are a pain in the ass. You never, ever want to have process problems ... That's when people start getting really angry at each other.
    • @littleidea: Almost every task run under Borg contains a built-in HTTP server that publishes information about the health of the task...
    • W. Daniel Hillis: For Richard [Feynman], figuring out these problems was a kind of a game. He always started by asking very basic questions like, “What is the simplest example?” or “How can you tell if the answer is right?” He asked questions until he reduced the problem to some essential puzzle that he thought he would be able to solve.
    • @ewolff: "Every hackathon uses Lambda. They build really complicated, production-ready systems in 12h" @adrianco at @microXchg
    • Daniel Bryant: The term "microservices" itself will probably disappear in the future, but the new architectural style of functional decomposition is here to stay.
    • @rbranson: The NoSQL movement might be a disappointment, but emerging from the rubble is the log-based (i.e. Kafka) model that actually works.
    • Chip Overclock: Surprisingly, GPS satellites actually know nothing about position. What they know about is time.
    • @codinghorror: I look at my old blog posts and think... there was a time when I believed 24GB was a lot of RAM
    • vidarh: Depending on your workloads, DO servers can come out cheaper or more expensive than AWS, but bandwidth at DO is so much cheaper than AWS that for bandwidth intensive stuff I can't serve entirely out of Europe (where Hetzner is vastly cheaper than DO again), DO is often a much cheaper alternative. Sometimes we use it as a cost-cutting do-it-yourself CDN in front of AWS for clients that insist on S3 for storage (and again where we can't just cache everything in Europe for latency reasons). For bandwidth heavy applications, you can pay for significant numbers of Droplets from the AWS bandwidth savings alone.
    • lobster_johnson: we use Google Container Engine (hosted Kubernetes), with Salt for the non-GKE VMs. This is needed because K8s is not mature enough to host all the things. In particular, stateful sets are still in beta. 
    • anonymous: The overall impact [algorithms] will be utopia or the end of the human race; there is no middle ground foreseeable. I suspect utopia given that we have survived at least one existential crisis (nuclear) in the past and that our track record toward peace, although slow, is solid.
    • keenio: In conclusion, the TCO is probably significantly lower for Kinesis. So is the risk. And in most projects, risk-adjusted TCO should be the final arbiter.
    • Adem Efe Gencer: the weekly [Bitcoin] mining power of a single miner has never exceeded the 30% of the overall mining power in 2016. Morever, in the second half of the year, the highest mining power has consistently been under the 20% range.
    • David Rosenthal: The security downside of Postel's Law is even more fundamental. The law requires the receiver to accept, and do something sensible with, malformed input. Doing something sensible will almost certainly provide an attacker with the opportunity to make the receiver do something bad.
    • douche: That's pretty much the way it has always been. You can go back at least to the Civil War and find politics has had more to do with procurement than performance of the weapon systems in question.
    • Jonathan Suen: While the brain and the Internet clearly operate using very different mechanisms, both use simple local rules that give rise to global stability. I was initially surprised that biological neural networks utilized the same algorithms as their engineered counterparts, but, as we learned, the requirements for efficiency, robustness, and simplicity are common to both living organisms and the networks we have built.
    • Bruce Johnson: Code reviews set the tone for the entire company that everything we do should be open to scrutiny from others, and that such scrutiny should be a welcome part of your workflow rather than viewed as threatening.
    • codingmyway: I think some miners are against any increase because it will lower fees. Without a blocksize limit fees tend to zero, which is fine while there is the block reward but they still want to milk the congestion fees. To say they are pro segwit or pro unlimited is bluffing. They are pro status quo and congestion and high fees.
    • edejong: Many engineers I have worked with like to throw around terms like: "CQRS", "Event sourcing", "no schema's", "document-based storage", "denormalize everything" and more. However, when pushed, I often see that they lack a basic understanding of DBMSes, and fill up this gap by basically running away from it. For 95% of the jobs, a simple, non-replicated (but backed-up) DBMS will do just fine.
    • adamu__: If China were to shut down bitcoin mining, my understanding is that the worst case scenario is much more dire. The network only adjusts the 'difficulty' relative to current network hash power every 2,016 blocks. Depending on the severity of the overall hash power reduction, new block discovery might slow down significantly. This would also delay a recalculation of the new difficulty accommodating the reduction in hash power. The network could be severely throttled for weeks.
    • boulos: Slightly off-topic, but EC2 doesn't really scale independently if you compare it to GCE. We let you combine 24 vcpus with 39 GB of RAM, 3 partitions of Local SSD and a few GPUs, all independently (though the ratio of RAM to vcpu is currently bounded between .9 and 6.5).
    • Veratyr: Personally, I settled with colocation. I pay $60/mo + $2k one-off for the initial hardware + say $150/5y/4TB HDD, which, for 80TB of storage over 5y comes out to a total of ~$88/mo, or $0.001/GBmo. 

  • Now this is object oriented programming. New software for increasingly flexible factory processes: new software that allows each individual component to tell the machine what has to be done. By breaking away from central production planning, factories can achieve unprecedented agility and flexibility... Everything would go much faster if production and the requisite machines were not rigidly set by a control program, but if every component itself knew the best way for it to be moved quickly through the process chain. 

  • Relax. Videos from TensorFlow Dev Summit 2017 are now available. Also, Learn TensorFlow and deep learning, without a Ph.D. Also also, Deep Learning book.

  • Google is Introducing Cloud Spanner: a global database service for mission-critical applications. It will be interesting to see if Spanner, as a unique hard to duplicate feature, becomes a Google Cloud differentiator. Will it make the delta between the clouds significant enough that developers choose Google? Quizlet, already running on GCP, really likes Spanner, but it's not a drop in replacement for MySQL. Like with NoSQL there's special care and feeding to make it work, but that's the sacrifice high QPS requires. Performance: "Cloud Spanner queries have higher latency at low throughputs compared with a virtual machine running MySQL. Spanner's scalability, however, means that a high-capacity cluster can easily handle workloads that stretch our MySQL infrastructure." And p90s are consistently lower than 50 ms. Cost: "For very small or low-throughput databases Cloud Spanner is overkill [min ~$8,000/yr]...Cloud Spanner comparable or slightly cheaper based on the performance in our testing."  With Spanner hitting the market maybe that will help CockroachDB? Some older articles: Spanner - It's About Programmers Building Apps Using SQL Semantics At NoSQL ScaleGoogle Spanner's Most Surprising Revelation: NoSQL Is Out And NewSQL Is InF1 And Spanner Holistically ComparedHow Google Invented An Amazing Datacenter Network Only They Could Create

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

Sponsored Post: Aerospike, GoCardless, Auth0, InnoGames, Contentful, Stream, Scalyr, VividCortex, MemSQL, InMemory.Net, Zohocorp

Tue, 02/14/2017 - 17:56

Who's Hiring?
  • GoCardless is building the payments network for the internet. We’re looking for DevOps Engineers to help scale our infrastructure so that the thousands of businesses using our service across Europe can take payments. You will be part of a small team that sets the direction of the GoCardless core stack. You will think through all the moving pieces and issues that can arise, and collaborate with every other team to drive engineering efforts in the company. Please apply here.

  • InnoGames is looking for Site Reliability Engineers. Do you not only want to play games, but help building them? Join InnoGames in Hamburg, one of the worldwide leading developers and publishers of online games. You are the kind of person who leaves systems in a better state than they were before. You want to hack on our internal tools based on django/python, as well as improving the stability of our 5000+ Debian VMs. Orchestration with Puppet is your passion and you would rather automate stuff than touch it twice. Relational Database Management Systems aren't a black hole for you? Then apply here!

  • Contentful is looking for a JavaScript BackEnd Engineer to join our team in their mission of getting new users - professional developers - started on our platform within the shortest time possible. We are a fun and diverse family of over 100 people from 35 nations with offices in Berlin and San Francisco, backed by top VCs (Benchmark, Trinity, Balderton, Point Nine), growing at an amazing pace. We are working on a content management developer platform that enables web and mobile developers to manage, integrate, and deliver digital content to any kind of device or service that can connect to an API. See job description.
Fun and Informative Events
  • DBTA Roundtable Webinar: Fast Data: The Key Ingredients to Real-Time Success. Thursday February 23, 2017 | 11:00 AM Pacific Time. Join Stephen Faig, Research Director Unisphere Research and DBTA, as he hosts a roundtable discussion covering new technologies that are coming to the forefront to facilitate real-time analytics, including in-memory platforms, self-service BI tools and all-flash storage arrays. Brian Bulkowski, CTO and Co-Founder of Aerospike, will be speaking along with presenters from Attunity and Hazelcast. Learn more and register.

  • Your event here!
Cool Products and Services
  • Working on a software product? Clubhouse is a project management tool that helps software teams plan, build, and deploy their products with ease. Try it free today or learn why thousands of teams use Clubhouse as a Trello alternative or JIRA alternative.

  • A note for .NET developers: You know the pain of troubleshooting errors with limited time, limited information, and limited tools. Log management, exception tracking, and monitoring solutions can help, but many of them treat the .NET platform as an afterthought. You should learn about Loupe...Loupe is a .NET logging and monitoring solution made for the .NET platform from day one. It helps you find and fix problems fast by tracking performance metrics, capturing errors in your .NET software, identifying which errors are causing the greatest impact, and pinpointing root causes. Learn more and try it free today.

  • Auth0 is the easiest way to add secure authentication to any app/website. With 40+ SDKs for most languages and frameworks (PHP, Java, .NET, Angular, Node, etc), you can integrate social, 2FA, SSO, and passwordless login in minutes. Sign up for a free 22 day trial. No credit card required. Get Started Now.

  • Build, scale and personalize your news feeds and activity streams with getstream.io. Try the API now in this 5 minute interactive tutorial. Stream is free up to 3 million feed updates so it's easy to get started. Client libraries are available for Node, Ruby, Python, PHP, Go, Java and .NET. Stream is currently also hiring Devops and Python/Go developers in Amsterdam. More than 400 companies rely on Stream for their production feed infrastructure, this includes apps with 30 million users. With your help we'd like to ad a few zeros to that number. Check out the job opening on AngelList.

  • Scalyr is a lightning-fast log management and operational data platform.  It's a tool (actually, multiple tools) that your entire team will love.  Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. .  Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • InMemory.Net provides a Dot Net native in memory database for analysing large amounts of data. It runs natively on .Net, and provides a native .Net, COM & ODBC apis for integration. It also has an easy to use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • VividCortex is a SaaS database monitoring product that provides the best way for organizations to improve their database performance, efficiency, and uptime. Currently supporting MySQL, PostgreSQL, Redis, MongoDB, and Amazon Aurora database types, it's a secure, cloud-hosted platform that eliminates businesses' most critical visibility gap. VividCortex uses patented algorithms to analyze and surface relevant insights, so users can proactively fix future performance problems before they impact customers.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network. 

If any of these items interest you there's a full description of each sponsor below...

Categories: Architecture

Part 3 of Thinking Serverless —  Dealing with Data and Workflow Issues

Mon, 02/13/2017 - 17:56

This is a guest repost by Ken Fromm, a 3x tech co-founder — Vivid Studios, Loomia, and Iron.io. Here's Part 1 and 2

This post is the third of a four-part series of that will dive into developing applications in a serverless way. These insights are derived from several years working with hundreds of developers while they built and operated serverless applications and functions. The platform was the serverless platform from Iron.io but these lessons can also apply to AWS LambdaGoogle Cloud FunctionsAzure Functions, and IBM’s OpenWhisk project.

Serverless Processing — Data Diagram

Thinking Serverless! The Data
Categories: Architecture

Stuff The Internet Says On Scalability For February 10th, 2017

Fri, 02/10/2017 - 17:56

Hey, it's HighScalability time:

 

It was a game of drones.
If you like this sort of Stuff then please support me on Patreon.
  • Half a trillion: Apple’s cash machine; 4,000-5,000: collected data points per adult in US; 10 million: gallons of gas UPS saves turning right; 2.27: Tesla 0-60 time; 40: complex steps to phone security; $2.3 billion: VR/AR investment in 2016; 18%: small players make up public cloud services market; 500°C: first chip to survive on Venus; 5 billion: ever notes; 375,000: images from The Metropolitan Museum of Art in public domain; 18 million: queries per minute against Facebook's Beringei database; 159: jobs per immigrant founder; 2.5 miles: whales breach for stronger signal; 10,000x: computers faster in 2035; 

  • Quotable Quotes: 
    • @martin_casado: Chinese factory replaces 90% of human workers with robots. Production rises by 250%, defects drop by 80%
    • Jure Leskovec: It’s [trolling] a spiral of negativity. Just one person waking up cranky can create a spark and, because of discussion context and voting, these sparks can spiral out into cascades of bad behavior. Bad conversations lead to bad conversations. People who get down-voted come back more, comment more and comment even worse.
    • sudhirj: The first concrete thing I learnt is this - implement pull first, it works 100% of the time, but may be inefficient with regards to time. Then implement push, it works 99% of the time but is much faster. But always have both running.
    • Tom Randall: California’s goal is considerable, but it’s dwarfed by Tesla’s ambition to single-handedly deliver 15 gigawatt hours 1 of battery storage a year by the 2020s—enough to provide several nuclear power plants–worth of electricity to the grid during peak hours of demand
    • @aphyr: Like I can't show that it's 100% correct, but so far I haven't found a way to break 3.4.0. Opens up a bunch of new use cases for MongoDB.
    • Azethoth666: The coming fast non-volatile memory architectures will be interesting. Everything will be in memory, but it will not go away. The infection cycle will have to clean up after itself or remain in the super fast volatile memory parts.
    • StorageMojo: In five years the specter of AWS cloud dominance will be a distant memory. The potential cloud market is enormous and we are, in effect, where the computer industry was in 1965. AWS will be successful, just not dominant. No tears for AWS.
    • @johnrobb: ~ 'Bots make public conversation a synthetic conversation. This makes it very difficult to know what consensus looks like.
    • W. Daniel Hillis: One day when I was having lunch with Richard Feynman, I mentioned to him that I was planning to start a company to build a parallel computer with a million processors. His reaction was unequivocal, “That is positively the dopiest idea I ever heard.”
    • @supershabam: Every database is a message bus if you try hard enough
    • mlechha: Boltzmann machines are a stochastic version of the Hopfield network. The training algorithm simply tries to minimize the KL divergence between the network activity and real data. So it was quite surprising when it turned out that the algorithm needed a "dream phase" as they call it. Francis Crick was inspired by this and proposed a theory of sleep.
    • @benjammingh: OH "Docker is Latin for a fire consisting predominantly of tires
    • UweSchmidt: "Real" bitcoining doesn't use services like coinbase; the coins are on your computer which you have to secure yourself. At least this is what you get told in cryptocurrency forums when one of the exchanges get hacked.
    • @axleyjc: 'Think of your System as a "Set of annotated request trees"' to manage microservice complexity @adrianco @ExpediaEng
    • @happy_roman: VW CEO on Tesla: "We'll win in the end, because of our abilities to scale & spread production."
    • aaron bell: Whichever cloud provider you pick based on your needs and their specific offering, I beg of you — please don’t try hybrid
    • zebra9978: Kubernetes introduces a lot of upfront complexity with little benefit sometimes. For example, kargo is failing with Flannel, but works with Calico (and so on and so forth). Bare metal deployments with kubernetes are a big pain because the load balancer setups have not been built for it - most kubernetes configs depend on cloud based load balancers (like ELB). In fact, the code for bare metal load balancer integration has not been fully written for kubernetes.
    • a13n: This is huge. 87-99% shared code between iOS and Android. Someday companies as big as Instagram won't need to have entire separate product teams for separate platforms.
    • David Rosenthal: I've always said that the chief threat to digital preservation is economic; digital information being very vulnerable to interruptions in the money supply. 
    • YZF: There are no channels [in C++], there are no lightweight/green threads, there's no standard HTTP library, no standard crypto libraries, no standard test framework. For certain classes of applications this makes Go significantly more productive and significantly less bug/error prone. Not to mention compile times.
    • jdwyah: Kinesis firehose to S3 and then query with Athena is pretty great. I've been very happy with the combo.
    • mcherm: Your example from RethinkDB really struck home to me. The idea that superior technology might lose out due to poor marketing or (in this case) a system that is optimized for the real world rather than being optimized for benchmarks really disturbs me.
    • Aras PranckeviÄŤius: Moral of the story is: code that used to do something with five things years ago might turn out to be problematic when it has to deal with a hundred. And then a thousand. And a million. And a hundred billion. Kinda obvious, isn’t it?
    • kordless: I've come to a hypothesis that technology's purpose is to gently erode the concept of "self"
    • Microsoft: Close to a year ago we reset and focused on how we would actually get Git to scale to a single repo that could hold the entire Windows codebase (include estimates of growth and history) and support all the developers and build machines.
    • XorNot: I've run extensive benchmarks of Hadoop/HBase in Docker containers, and there is no performance difference. There is no stability difference (oh a node might crash? Welcome to thing which happens every day across a 300 machine cluster). Any clustered database setup should recover from failed nodes. Any regular relational database should be pretty close to automated failover with replicated backups and an alert email. Containerization doesn't make this better or worse, but it helps a lot with testing and deployment.
    • Dan Luu: When I was at Google, someone told me a story about a time that “they” completed a big optimization push only to find that measured page load times increased. When they dug into the data, they found that the reason load times had increased was that they got a lot more traffic from Africa after doing the optimizations. The team’s product went from being unusable for people with slow connections to usable, which caused so many users with slow connections to start using the product that load times actually increased.

  • There's a quintessential Silicon Valley moment in The Founder, a movie about the more interesting than expected McDonald's origin story. Brothers Mac and Dick McDonald kicked around from startup to startup. Nothing stuck. Drive-ins ruled the day, but were ripe for disruption. They were slow, used lots of servers, had too many options, attracted the wrong user base, and often delivered the wrong results. Metrics told them users mostly bought burgers, fries, and milkshakes. So the brothers decided to completely rethink how burgers were made and sold. What they came up with disrupted the food industry: a serverless drive-in based on a new low latency pipeline for making burgers called the Speedy System. An order was delivered within 30 seconds of being made; metrics helped control the latency distribution. Here's a short vignette showing how it was done. You'll love it. They traced the exact dimensions of the kitchen and conducted numerous simulations to figure out the optimal configuration. Users walk up to the window to order, so no servers. The API was narrow, only a few items could be ordered. Waste was reduced because utensils were done away with using innovative packaging design. Automation and a proprietary tool chain delivered a consistent product experience. And as often happens in Silicon Valley the founders were out maneuvered. While the McDonald brothers innovated the tech, Ray Kroc innovated the business model. Guess who ended up with everything? 

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

In-memory noSQL DBMS Client in Big Data Cluster

Wed, 02/08/2017 - 17:56

This is guest post by Sergei Sheinin, creator of the 2DX Web UI Database Cluster Framework, a low latency big data cluster with in-memory noSQL DBMS Web Browser client.

When I began working in the field of data management the disconnect between rigid structure of relational database tables and free form of documents managed by end users and their businesses stood out as a technical and managerial hurdle. On the one hand there were strict definitions of normalized relational database models and unstructured document formats on the other. Often the users in charge of changing document structures held organizational responsibilities far removed from database modeling or programming. On one occasion I was involved in a project where call center operators made on the fly decisions to update a document structure based on phone conversations with customers. Such updates had to be streamed into a relational back-end creating havoc in database structure and build of table columns.

In seeking a permanent solution I researched merits of Entity-Attribute-Value database schema and its applications. This technique proved successful in enabling front end users to modify relational-bound documents through performing updates to structure described in their metadata. However application of EAV raised its own issues, for example accommodation of updated document metadata at times required changes to definitions of the relational tables, attention of developers due to complexity of application layer in client-server interoperability, rapidly growing fact tables and performance of multiple join statements in select queries...

Categories: Architecture

Part 2 of Thinking Serverless —  Platform Level Issues 

Mon, 02/06/2017 - 17:56

This is a guest repost by Ken Fromm, a 3x tech co-founder — Vivid Studios, Loomia, and Iron.io. Here's Part 1.

Job processing at scale at high concurrency across a distributed infrastructure is a complicated feat. There are many components involvement — servers and controllers to process and monitor jobs, controllers to autoscale and manage servers, controllers to distribute jobs across the set of servers, queues to buffer jobs, and whole host of other components to ensure jobs complete and/or are retried, and other critical tasks that help maintain high service levels. This section peels back the layers a bit to provide insight into important aspects within the workings of a serverless platform.

Throughput

Throughput has always been the coin of the realm in computer processing — how quickly can events, requests, and workloads be processed. In the context of a serverless architecture, I’ll break throughput down further when discussing both latency and concurrency. At the base level, however, a serverless architecture does provide a more beneficial architecture than legacy applications and large web apps when it comes to throughput because it provide for far better resource utilization.

In a post by Travis Reeder on What is Serverless Computing and Why is it Important he addresses this topic.

Cost and optimal use of resources is a huge reason to do serverless. If you are a big company with a bunch of apps/APIs/microservices, you are currently running those things 24/7 and they are using resources 100% of the time, no matter if they are in use or not. With a FaaS infrastructure, instead of running apps 24/7, you can execute functions for any number of apps on demand and share all the same resources. Theoretically, you could reduce waste (idle time) to almost nothing while still providing fast response time. For a FaaS provider, this cost savings is passed up to the end user, the developer. For an enterprise, this can reduce capex and opex big time.

Another way of looking at it is that by moving to more discrete tasks that can run in universal platform with self-contained dependencies, tasks can run anytime anywhere across a serverless architecture. This is in contrast to a set of stand alone monolithic applications whereby operations teams have to spend significant cycles arbitrating which applications to scale, when, and how. (A serverless architecture can also increase throughput of application and feature development but much has been said in this regard as it relates to microservices and functions as a service.)

A Graph of Tasks and Projects

The graph below shows a set of tasks over time for a single account on the a serverless platform. The overarching yellow line indicates all tasks for an account and the other lines represent projects within the account. The project lines should be viewed as a microservice or a specific set of application functions. A few years ago, the total set would have been built as a traditional web application and hosted as a long-running application. As you can see, however, each service or set of functions has a different workload characteristic. Managing the aggregated set at an application level is far more complex than managing at the task level within a serverless platform, not to mention the resource savings by scaling commodity task servers as opposed to much more complex application servers.

All Tasks (Application View) vs Specific Tasks (Serverless View)

Latency
Categories: Architecture

Stuff The Internet Says On Scalability For February 3rd, 2017

Fri, 02/03/2017 - 17:56

Hey, it's HighScalability time:


 

We live in interesting times. F/A-18 Super Hornets Launch drone swarm.
If you like this sort of Stuff then please support me on Patreon.
  • 100 billion: words needed to train large networks; 73,653: hard drives at Backblaze; 300 GB hour: raw 4k footage; 1993: server running without rebooting; 64%: of money bet is on the Patriots; 950,000: insect species; 374,000: people employed by solar energy; 10: SpaceX launched Iridium Next satellites; $1 billion: Pokémon Go revenue; 1.2 Billion: daily active Facebook users; $7.17 billion: Apple service revenue; 45%: invest in private cloud this year; 

  • Quoteable Quotes:
    • @kevinmarks: #msvsummit @varungyan: Google's scale is about 10^10 RPCs per second in our microservices
    • language: "Order and chaos are not a properties of things, but relations of an observer to something observed - the ability for an observer to distinguish or specify pattern."
    • general_ai: Doing anything large on a machine without CUDA is a fool's errand these days. Get a GTX1080 or if you're not budget constrained, get a Pascal-based Titan. I work in this field, and I would not be able to do my job without GPUs -- as simple as that. You get 5-10x speedup right off the bat, sometimes more. A very good return on $600, if you ask me.
    • Al-Khwarizmi: Maybe I'm just not good at it and I'm a bit bitter, but my feeling is that this DL [deep learning] revolution is turning research in my area from a battle of brain power and ingenuity to a battle of GPU power and economic means
    • Space Rogue: pcaps or it didn't happen
    • LtAramaki: Everyone thinks they understand SOLID, and when they discuss it with other people who say they understand SOLID, they think the other party doesn't understand SOLID. Take it as you will. I call this the REST phenomenon.
    • evaryont: I don’t see this as them [Google] trying to “seize” a corner of the web, but rather Google taking it’s paranoia to the next level. If they can’t ever trust anyone in the system [Certificate Authority], why not create your own copy of the system that no one else can use? Being able to have perfect security from top to bottom, similar to their recently announced custom chips they put in every one of their servers.
    • David Press: The benefits of SDN are less about latency and uptime and more about flexibility and programmability.
    • Benedict Evans: Web 2.0 was followed not by anything one could call 3.0 but rather a basic platform shift...one can see the rise of machine learning as a fundamental new enabling technology...one can see quite a lot of hardware building blocks for augmented reality glasses...so the things that are emerging at the end of the mobile S-Curve might also be the beginning of the next curve. 
    • @kevinmarks: 20% people have 0 microservices in production - the rest are already running microservices
    • @joeerl: You've got to be joking - should be 1M clients/server at least
    • SikhGamer: We considered using RabbitMQ at work but ultimately opted for SNS and SQS instead. Main reason being that we cared about delivering value and functionality. Over the cost of yet managing another resource. And the problems of reliability become Amazon's problem. Not ours.
    • DataStax: A firewall is the simplest, most effective means to secure a database. Sounds complicated, but it’s so easy a government agent could do it.
    • @danielbryantuk: "If you think good architecture is expensive, try bad architecture" @KevlinHenney #OOP2017
    • Peter Dizikes: The new method [wisdom from crowds] is simple. For a given question, people are asked two things: What they think the right answer is, and what they think popular opinion will be. The variation between the two aggregate responses indicates the correct answer.
    • Philip Ball: Looked at this way, life can be considered as a computation that aims to optimize the storage and use of meaningful information. So living organisms can be regarded as entities that attune to their environment by using information to harvest energy and evade equilibrium.
    • Ed Sutton: The study shows the effectiveness of personality targeting by showing that marketers can attract up to 63% more clicks and up to 1400% more conversions in real-life advertising campaigns on Facebook when matching products and marketing messages to consumers’ personality characteristics.
    • Pete Trbovitch: Today’s mobile app ecosystem most closely resembles the PC shareware era. Apps that are offered free to download can carry an ad-supported income model, paid extended content, or simply bonus features to make the game easier to beat. The bar to entry is as low as it’s ever been 
    • @BenedictEvans: Global mainframe capacity went up 4-5x from 2000-2010. ‘Dead’ technology can have a very long half-life
    • @searls: I keep seeing teams spend months building custom infrastructure that could be done in 20 minutes with Heroku, Github, Travis. Please stop.
    • @mdudas: Starbucks says popularity of its mobile app has created long lines at pickup counters & led to drop in transactions.
    • @cdixon: Software eats networking: Nicira (NSX) will generate $1B revenue for VMWare this year
    • raubitsj: With respect to vibration: we [Google] found vibration caused by adjacent drives in some of our earlier drive chassis could cause off-track writes. This will cause future reads to the data to return uncorrectable read errors. Based on Backblaze's methodology they will likely call out these drives as failed based on SMART or RAID/ReedSolomon sync errors.

  • Well this is different. GitLab live streamed the handling of their GitLab.com Database Incident - 2017/01/31. It wasn't what you would call riveting, but that's an A+++ for transparency. They even took audience questions during the process. What went wrong? The snippets function was DDoSd which generated a large increase of data to the database so the slaves were not able to keep up with the replication state. WAL transaction files that were no longer in the production backlog were being requested so transaction logs were missed. They were starting the copy again from a known good state then things went sideways. They were lucky to have a 6 hour old backup and that's what they were restoring too. Sh*te happens, how the team handled it and their knowledge of the system should give users confidence going forward.

  • OK, this turned out to be false, but nobody doubted it could be true or where things are going in the future. Hotel ransomed by hackers as guests locked out of rooms.

  • Interesting use of Lambda by AirBnB. StreamAlert: Real-time Data Analysis and Alerting. There's an evolution from compiling software using libraries that must be in the source tree; running software that requires downloading lots of package from a repository; and now using services that require a lot of other services to be available in the environment for a complex pipeline to run. StreamAlert just doesn't use Lambda, it also uses Kinesis, SNS, S3, Cloudwatch, KMS, and IAM. Each step is both a deeper level of lock-in and an enabler of richer functionality. What does StreamAlert do?: a real-time data analysis framework with point-in-time alerting. StreamAlert is unique in that it’s serverless, scalable to TB’s/hour, infrastructure deployment is automated and it’s secure by default. 

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

Performance, Scalability, and High Availability: 3 Key Infrastructure Adaptability Requirements

Thu, 02/02/2017 - 21:31

This is a guest post by Tony Branson

Performance, scalability, and HA are often used interchangeably, and any confusion about them can result in unrealistic metrics and deployment delays. It is important to invest your time and understand the differences among these three approaches before you invest your money in resilient systems.

Performance

Categories: Architecture