Categories
Coding Devops Distributed Systems Getting Started in Coding Scala Software Engineering Uncategorized

Robinhood CLI for Quick Stock Exits + Multi-Factor Login Support


What Robinhood is Missing

I’ve really enjoyed trading with Robinhood (NO FEES) these last few months, but viewing some metrics in their web UI and mobile app takes too many clicks. For one, determining my percent return requires three clicks for each stock position. This is too slow when you want to know quickly when to exit a position.

The CLI Tool

Preferably, I wanted a quick and dirty CLI tool in Python to crunch these numbers for me. I ended up finding a great open source Python framework to interact with Robinhood’s backend API and decided to make some tweaks to it to support:

  1. Streamlined Multi-factor Auth Login
  2. Improved Security with credentials as environment variables
  3. Calculate and display percent return on each position
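To illustrate the second point, here is a minimal sketch of reading credentials from environment variables instead of hard-coding them. The variable names ROBINHOOD_USERNAME and ROBINHOOD_PASSWORD are assumptions for illustration and may not match what the fork actually uses.

```python
import os


def load_credentials(env=os.environ):
    """Return (username, password) read from environment variables.

    The variable names below are hypothetical, chosen for this sketch.
    """
    required = ("ROBINHOOD_USERNAME", "ROBINHOOD_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        # Fail early with a clear message rather than prompting or guessing.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["ROBINHOOD_USERNAME"], env["ROBINHOOD_PASSWORD"]
```

Passing the environment in as a parameter keeps the function easy to test without touching the real shell environment.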

You can find my working Fork here: https://github.com/parkergordonio/Robinhood


Running the Tool

Executing Script + Login Prompt + MFA Code: 

(I have a bash/zsh alias, “robin”, pointing to the script get-metrics.sh.)

Metrics Output:

By sorting each position on this newly generated percent return field, your eyes are quickly drawn to the positions that you may want to exit soon.
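The calculation behind that field is simple. Below is a rough sketch, assuming each position is a dict with hypothetical keys "symbol", "average_buy_price", and "current_price"; the real API response fields may be named differently.

```python
def percent_return(average_buy_price, current_price):
    """Percent gain or loss of the current price relative to the average buy price."""
    return (current_price - average_buy_price) / average_buy_price * 100.0


def rank_positions(positions):
    """Add a percent_return field to each position and sort worst-first."""
    rows = [
        dict(p, percent_return=percent_return(p["average_buy_price"], p["current_price"]))
        for p in positions
    ]
    return sorted(rows, key=lambda row: row["percent_return"])


if __name__ == "__main__":
    sample = [
        {"symbol": "AAA", "average_buy_price": 50.0, "current_price": 75.0},
        {"symbol": "BBB", "average_buy_price": 200.0, "current_price": 150.0},
    ]
    for row in rank_positions(sample):
        print(f'{row["symbol"]:<6}{row["percent_return"]:>8.2f}%')
```

Sorting worst-first puts the biggest losers at the top; reverse the sort if you would rather see winners first.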


Happy trading!


Note: There are plans to have this work merged back into source repo in some fashion.

Categories
Code camp Coding Containers Devops Distributed Systems Docker Getting Started in Coding LinkedIn reddit Scala Software Engineering Uncategorized Web

A Better Docker Container Tagging Strategy for CI/CD

Continuous delivery is difficult, but if your applications are containerized with Docker you’re moving in the right direction to make things easier! Containers provide a ton of flexibility and portability, but they can become a nightmare once you realize the pain of container management. One thing to make it easier is to have a standard container tagging strategy to provide common assumptions and vernacular amongst the team.

Container-Tagging-Logo
Docker container build pipelines, tagging strategies, and CI/CD should go hand-in-hand.

Do I Need a Better Tagging Strategy?

You might want to rethink your Docker Image tagging strategy if you don’t immediately know the answer to the following questions:

  1. “What git commit hash of our app is currently running in production?”
  2. “Which container version in our registry is currently running in production?”

Strategy: Release Candidate Lifecycle Tagging

The tagging method I find most attractive is what I call “Release Candidate Lifecycle Tagging”. The tag values should follow a flavor of release candidate terminology along the delivery pipeline similar to:

Build Stage                                      Tag Value                  Development Stage
Initial Build                                    <Commit Hash>, unstable    “Alpha”
Passed Tests (Contract, Integration, Service)    stable                     “Release Candidate”
Deployed to Production + Smoke Tested            live                       “GA” (General Availability)

What it Looks like in a Build Pipeline:

In the following example of a release of the app “app”, the current git checkout SHA hash is “ff613f”. Initially building the Docker image with the git SHA hash is a pivotal piece: it lets teams know exactly which commit to check out for local or remote debugging of that version of the application.

Tagging-Flow-Diagram
A CI/CD build pipeline with incremental image tagging.
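As a rough sketch of what each pipeline stage might run, the helper below generates the docker commands for the “app” / “ff613f” example. The function names are made up for illustration, and pushing every tag to a registry is an assumption; adapt it to however your CI tool shells out.

```python
def initial_build_commands(image, commit):
    """Build the image tagged with the commit hash, then mark it unstable."""
    commit_ref = f"{image}:{commit}"
    return [
        f"docker build -t {commit_ref} .",
        f"docker push {commit_ref}",
    ] + promote_commands(image, commit, "unstable")


def promote_commands(image, commit, stage_tag):
    """Point a lifecycle tag (unstable, stable, live) at the commit-tagged image."""
    source = f"{image}:{commit}"
    target = f"{image}:{stage_tag}"
    return [f"docker tag {source} {target}", f"docker push {target}"]


if __name__ == "__main__":
    steps = initial_build_commands("app", "ff613f")
    steps += promote_commands("app", "ff613f", "stable")  # after tests pass
    steps += promote_commands("app", "ff613f", "live")    # after deploy + smoke test
    print("\n".join(steps))
```

Because every lifecycle tag is derived from the commit-tagged image, the commit hash stays attached to the same image all the way to production.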

Taking it Further

Post-production Tags

With canary or blue/green deployments, additional tagging stages could be added incrementally to reflect not only that containers have made it to production, but also that they have reached certain levels of validity or traffic-based performance metrics.

Retiring Images

Once an app image has been replaced by its upgraded successor, the previous image needs to remain in the Docker registry for some amount of time in case a rollback is required. This can be accomplished by adding another tag after the image is retired, like “retired-<RETIRED_DATE>”. A reaping process could then take advantage of this new tag and remove only images that were retired at least X days ago.
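Here is one way the reaping decision could look, as a sketch only. It assumes the retired date is written in ISO format (for example “retired-2019-02-19”) and that you have already listed each image's tags from your registry; both the date format and the tags_by_image structure are assumptions.

```python
from datetime import date, timedelta

RETIRED_PREFIX = "retired-"


def retired_on(tag):
    """Return the date encoded in a retired-YYYY-MM-DD tag, or None."""
    if not tag.startswith(RETIRED_PREFIX):
        return None
    try:
        return date.fromisoformat(tag[len(RETIRED_PREFIX):])
    except ValueError:
        return None


def images_to_reap(tags_by_image, today, max_age_days):
    """Pick images retired at least max_age_days ago that are not tagged live."""
    cutoff = today - timedelta(days=max_age_days)
    reapable = []
    for image, tags in tags_by_image.items():
        dates = [d for d in map(retired_on, tags) if d is not None]
        # Skip anything still live, even if it also carries a retired tag.
        if dates and max(dates) <= cutoff and "live" not in tags:
            reapable.append(image)
    return sorted(reapable)
```

Keeping the decision separate from the actual registry delete call makes it easy to dry-run the reaper before letting it remove anything.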

 


Credit

I want to give a shout out to Daniel Nephin, as his detailed and explanatory Github comments and issue discussions have led me to resolving many issues around Docker and container strategy.

Categories
Code camp Coding Distributed Systems First programming job Getting Started in Coding JAVA LinkedIn Scala Software Engineering Spark Uncategorized Web

From Junior to Senior: Software Engineering Must-Knows

*This is a living document and will be updated over time*

Why these Resources?

Along a software developer’s journey from post-grad to seasoned vet, you come across articles and literature that enlighten you, propelling your skills forward by miles rather than inches. This is a collection of those essential resources that I feel a software engineer should know to be an informed, efficient, and effective engineer.

Contents

  1. Maintaining Clean Code
  2. Database Design
  3. Lean Engineering
  4. Testing
  5. Technical Decision Making
  6. Managing Deployments
  7. Container Orchestration
  8. JVM
    1. JVM Tuning
    2. Scala
  9. Machine Learning

Resources

1. Maintaining Clean Code

Clean Code (Book by Robert Martin)

“Clean Code” is one of those books that, after reading it, you come out with an immediate feeling of both excitement (you know how to write maintainable code now!) and regret (you realize the code you have been writing your whole life is smelly!). While a few chapters are pretty dated technically, it successfully outlines sound practices for maintaining hygienic object-oriented codebases that can be borrowed for other programming paradigms. This book is a must-know!

https://www.amazon.com/Clean-Code-Handbook-Software-Craftsmanship/dp/0132350882

Dependency Injection (DI)/Inversion of Control (IoC)

https://www.martinfowler.com/articles/injection.html

2. Database Design

Normalization

Normalization is easy to skip early on, but its effects are tough to ignore later down the road. When designing databases, five extra minutes spent thinking about and adhering to normalization will save days, if not weeks, of redesign and data integrity cleanup later on. Trust me.
Short walkthrough on Normalization:

http://www.informit.com/articles/article.aspx?p=30646
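As a small illustration of the payoff (a sketch with made-up table and column names, using Python's built-in sqlite3 module): when the customer's email lives in one row of its own table instead of being copied onto every order, a change is a single UPDATE and every order sees it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        item        TEXT NOT NULL
    );
    INSERT INTO customers (id, email) VALUES (1, 'old@example.com');
    INSERT INTO orders (customer_id, item) VALUES (1, 'keyboard');
    INSERT INTO orders (customer_id, item) VALUES (1, 'monitor');
""")

# One UPDATE fixes the email for every order, with no copies to drift apart.
conn.execute("UPDATE customers SET email = ? WHERE id = ?", ("new@example.com", 1))

emails = [
    row[0]
    for row in conn.execute(
        "SELECT c.email FROM orders o "
        "JOIN customers c ON c.id = o.customer_id ORDER BY o.id"
    )
]
print(emails)
```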

3. Lean Engineering

Implementing Lean Software Development (Book by Mary and Tom Poppendieck)

http://tinyurl.com/y9xdf7ed

4. Testing

Testing Quadrants

Those needing to prune, or cherry pick certain testing practices into their operations, can benefit from the diagram “Agile Testing Quadrants”. It outlines each test type’s organizational boundaries, initiation mechanism, and outcomes.

http://searchsoftwarequality.techtarget.com/tip/Agile-testing-quadrants-Guiding-managers-and-teams-in-test-strategies

Is Unit Testing Worth it?

Chances are you eventually started work at a company whose culture had a baked-in focus on quality, where you set off following orders to test, then realized the benefits later. Others of you are among the testing thought-leaders at your organization and have to sell the benefit! This article gives you the points that express why unit testing is more than a nicety.

https://stackoverflow.com/questions/67299/is-unit-testing-worth-the-effort

Testing in a Microservices Architecture

https://martinfowler.com/articles/microservice-testing/

5. Technical Decision Making

Building Consensus Before Commitment

Encroaching on the famed “How to win friends and influence people” genre, this article explains how and why you should take a holistic approach to presentations and multi-org affecting decisions.

https://www.kitchensoap.com/2017/08/12/multiple-perspectives-on-technical-problems-and-solutions/

Technology Radar

A must in every developer’s exposure toolkit. The Thoughtworks team hand-curates the languages, frameworks, and practices that organizations should adopt, trial, and assess.

https://www.thoughtworks.com/radar

Site Reliability Engineering Learnings

http://danluu.com/google-sre-book/

6. Managing Deployments

Continuous Integration

https://www.thoughtworks.com/continuous-integration

Git Workflows

https://www.atlassian.com/git/tutorials/comparing-workflows

Terraform Up-and-Running

While not critical to know intimately, Terraform is an amazing option as a multi-PaaS hosting framework and Infrastructure as Code management tool.

https://www.terraformupandrunning.com/

7. Container Orchestration

Kubernetes vs ECS

While this article will quickly grow stale, it is a great comparison of two of the leaders in cloud container orchestration and hosting.

https://platform9.com/blog/kubernetes-vs-ecs/

8. JVM

Scala

Class and Package Naming Strategies

While we all like to think we always execute the best file and class packaging practices, this naming and scoping refresher from Nikita Volkov can keep you sharp!

https://stackoverflow.com/questions/17121773/scalas-naming-convention-for-traits

Scala Interview Questions

https://www.journaldev.com/8958/scala-interview-questions-answers

Effective Scala

http://twitter.github.io/effectivescala

Profiling

Extensive Learnings from JVM Performance Tuning

https://www.infoq.com/presentations/JVM-Performance-Tuning-twitter

Profiling with VisualVM

This tool is awesome for investigating how Java options affect performance, and for getting a feel for your app’s overall health.

Walkthrough: https://www.youtube.com/watch?v=z8n7Bg7-A4I

https://visualvm.github.io/documentation.html

9. Machine Learning

10 Algorithms Software Engineers must know

https://www.kdnuggets.com/2016/08/10-algorithms-machine-learning-engineers.html

Disclaimer on References

The resources in this list are intended to be self referencing and imply the original authors are the ones that are due an immense amount of credit.


Categories
JAVA Scala Web Zookeeper

Zookeeper in AWS: Practices for High Availability with Exhibitor


Overview

Zookeeper is a distributed, sequentially consistent system developed to attack the many tough use cases surrounding distributed systems, such as leader election in a cluster, configuration, and distributed locking. For more Zookeeper recipes visit: http://zookeeper.apache.org/doc/current/recipes.html. Zookeeper clusters (ensembles) can be made up of any number of nodes, but typically take the form of a three or five node ensemble, in which a minority of nodes can fail while the Zookeeper service continues serving traffic.
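The “minority can fail” rule comes from Zookeeper needing a majority (quorum) of the ensemble to be up. A quick sketch of the arithmetic, with function names of my own choosing:

```python
def quorum_size(ensemble_size):
    """Smallest number of nodes that forms a strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many nodes can be lost while a majority is still available."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for size in (3, 4, 5, 6):
        print(size, quorum_size(size), tolerated_failures(size))
```

Note that a four node ensemble tolerates no more failures than a three node one, which is why odd ensemble sizes are the norm.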

How we use Zookeeper

On the Search team at Careerbuilder we use Zookeeper as a configuration service replacing static configuration file deployment. We also have plans to move to Solr Cloud, which requires a Zookeeper Ensemble for its election and configuration tasks. Below is a system diagram for the entire Zookeeper/Exhibitor service in AWS with an auto-scaling group and consuming client libraries.

Zookeeper-noLines.png

Difficulties with Zookeeper

While Zookeeper provides a high level of reliability and availability through redundancy and leader election patterns, its critical use cases introduce a high level of risk for outages and mass failures. To mitigate a lot of the risk, the open source Exhibitor application should be used as a supervisory process to monitor and restart each Zookeeper process, as well as provide data backups, recovery, and automatic ensemble configuration. Exhibitor will be explained in more detail later.

Exhibitor: https://github.com/soabase/exhibitor/wiki.

Assumptions

  • Deployment will take advantage of AWS resources (Currently avoiding containerization)
  • Exhibitor will be deployed alongside Zookeeper to serve as its supervisory process and management UI (https://github.com/soabase/exhibitor/wiki)

Deployment

Infrastructure as Code: Frameworks

Infrastructure automation can be achieved through many open and closed source framework providers such as Ansible, Puppet, and Chef. For deployments on the Search team at Careerbuilder we use Chef as the means for setting up environments on self hosted boxes in AWS. The Chef + Ruby combo is a robust tool in managing infrastructure as code and I highly suggest it: https://learn.chef.io/.

Artifacts

To deploy a Zookeeper ensemble, the following artifacts and dependencies must be installed on each node in the cluster. Once built, each artifact can be uploaded to an artifact repository (in our case, an S3 bucket populated by a continuous integration build) to be downloaded and installed later by Chef during deployment.

  • JAVA

    • Zookeeper and Exhibitor are Java Virtual Machine (JVM) applications; therefore, Java must be installed on all machines running them.
  • Zookeeper

  • Exhibitor

    • Exhibitor can be run as a WAR file or with an embedded Jetty server. After finding the Tomcat/WAR based server cumbersome, we chose the Jetty based Exhibitor moving forward, and it has paid off through its ease of configuration.
    • Steps on how to build the Exhibitor artifact can be found on Github: https://github.com/soabase/exhibitor/wiki/Building-Exhibitor

Cloud Formation

On a higher level of Infrastructure as code lies the ability to orchestrate resource allocation, bringup, and system integration. AWS CloudFormation is a means to perform these deployment actions internal to AWS.

Generic Templates

By generalizing CloudFormation template files for your specific use cases and parameterizing them, you can use a homegrown application or Integration processes to regularly phoenix or A-B deploy your resources with ease.

Here at Careerbuilder we have developed an in-house JAVA application known as Nimbus that pulls our generic CF Stack templates from Github, populates their parameters with values from parameter files and triggers a CloudFormation Stack creation in AWS. This abstracts a lot of CloudFormation’s unused Stack template complexity.
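To give a feel for the parameter-population step, here is a small sketch. It is not Nimbus itself (which is an in-house Java application); it only shows converting a flat JSON parameter file, whose format is an assumption here, into the ParameterKey/ParameterValue list that the CloudFormation CreateStack API expects.

```python
import json


def to_stack_parameters(values):
    """Convert a flat {name: value} mapping into CloudFormation's parameter list."""
    return [
        {"ParameterKey": key, "ParameterValue": str(value)}
        for key, value in sorted(values.items())
    ]


def load_parameter_file(text):
    """Parse a JSON parameter file (hypothetical flat format) into stack parameters."""
    return to_stack_parameters(json.loads(text))


if __name__ == "__main__":
    example = '{"EnsembleSize": 5, "Environment": "development"}'
    print(json.dumps(load_parameter_file(example), indent=2))
```

The resulting list is what you would hand to whatever client actually triggers the stack creation.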

Cluster Discovery through Instance Tagging

In many cases you must assign an instance index to each machine in a clustered distributed system, either due to dataset sharding or static clustering. It is possible to deploy Zookeeper with Exhibitor in a static ensemble that does not use a shared S3 exhibitor.properties configuration file (more on this later), and this would require instance tagging. Each node could be tagged by CloudFormation with an attribute, say {“application”=“zookeeper-development”}, to be used as a selector during bring-up. The Chef/Ansible/Puppet process running on each node could then acquire the IPs of the nodes in the ensemble by going through a query-check-sleep cycle in the local Chef script until the correct number of Zookeeper boxes are up and running. However, this deployment model is clumsy and error-prone, so the shared configuration file in S3 for Exhibitor is highly suggested, as server IPs are added to and removed from the config file dynamically.
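For the curious, the query-check-sleep cycle described above boils down to something like the sketch below. The query_ips callable is a stand-in for whatever looks up instances by tag (for example, an EC2 describe call filtered on the tag); it is hypothetical and injected so the loop itself stays testable.

```python
import time


def wait_for_ensemble(query_ips, expected_count, poll_seconds=10.0,
                      max_attempts=30, sleep=time.sleep):
    """Poll until at least expected_count distinct node IPs are reported.

    query_ips: zero-argument callable returning the IPs of tagged nodes.
    Raises TimeoutError if the ensemble never reaches the expected size.
    """
    for attempt in range(max_attempts):
        ips = sorted(set(query_ips()))       # query
        if len(ips) >= expected_count:       # check
            return ips
        if attempt + 1 < max_attempts:
            sleep(poll_seconds)              # sleep, then try again
    raise TimeoutError(
        f"Only saw {len(ips)} of {expected_count} nodes after {max_attempts} attempts"
    )
```

Even in this tidy form you can see the weak spots: every node runs its own polling loop, and a node that never comes up stalls the rest until the timeout.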

Starting Exhibitor and Zookeeper

Exhibitor will start and restart the Zookeeper processes periodically during clustering tasks and rolling configuration changes. Therefore, all you need to do is start Exhibitor on each node; each node will then start Zookeeper and register itself with the shared config file specified by the “--s3config” parameter. Below is some example Ruby/Chef code that can be used to start Exhibitor:

# Start Exhibitor in the background; Exhibitor then starts and supervises Zookeeper.
execute "Start Exhibitor" do
  command "nohup java -jar #{node[:search][:exhibitor_dir]}/exhibitor.jar --hostname #{node['ipaddress']} --configtype s3 --s3config #{bucket} --s3backup true > /opt/search/zookeeper/exhibitorNohup.log 2> /opt/search/zookeeper/exhibitorNohup.error.log &"
  action :run
end

Since Exhibitor manages the Zookeeper process, it is important not to create any configuration files manually, nor to expect any manual configuration changes made to Zookeeper to persist after deployment or during runtime.

Exhibitor

Security

Securing Exhibitor/Zookeeper is an extensive topic left to the specific implementation of the reader. However, the Exhibitor Wiki lists command parameters that can be used to enable and configure security features within Exhibitor, and I suggest giving it a look.

https://github.com/soabase/exhibitor/wiki/Running-Exhibitor

Configuration

A typical exhibitor bucket root will look like the following, with an exhibitor.properties config file that you have uploaded containing a default level of configuration. I suggest also maintaining a ‘last known default’ config file in a “base-config” directory for backup of the exhibitor.properties file, in the event it gets corrupted.


An example exhibitor.properties file:

com.netflix.exhibitor-hostnames=
com.netflix.exhibitor-hostnames-index=0
com.netflix.exhibitor.auto-manage-instances-apply-all-at-once=1
com.netflix.exhibitor.auto-manage-instances-fixed-ensemble-size=5
com.netflix.exhibitor.auto-manage-instances-settling-period-ms=60000
com.netflix.exhibitor.auto-manage-instances=1
com.netflix.exhibitor.backup-extra=
com.netflix.exhibitor.backup-max-store-ms=86400000
com.netflix.exhibitor.backup-period-ms=60000
com.netflix.exhibitor.check-ms=30000
com.netflix.exhibitor.cleanup-max-files=3
com.netflix.exhibitor.cleanup-period-ms=43200000
com.netflix.exhibitor.client-port=2181
com.netflix.exhibitor.connect-port=2888
com.netflix.exhibitor.election-port=3888
com.netflix.exhibitor.java-environment=
com.netflix.exhibitor.log-index-directory=/opt/search/zookeeper/logIndex/
com.netflix.exhibitor.log4j-properties=
com.netflix.exhibitor.observer-threshold=999
com.netflix.exhibitor.servers-spec=
com.netflix.exhibitor.zoo-cfg-extra=syncLimit\=5&tickTime\=2000&initLimit\=10
com.netflix.exhibitor.zookeeper-data-directory=/opt/search/zookeeper_data
com.netflix.exhibitor.zookeeper-install-directory=/opt/search/zookeeper
com.netflix.exhibitor.zookeeper-log-directory=/opt/search/zookeeper
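The zoo-cfg-extra value packs several zoo.cfg settings into one line, with “=” escaped as “\=” per Java properties rules. A small sketch of unpacking it (the helper name is my own) also shows what the numbers mean: initLimit and syncLimit are counted in ticks, so they multiply against tickTime.

```python
def parse_zoo_cfg_extra(value):
    """Split an '&'-joined, backslash-escaped settings string into a dict."""
    pairs = value.replace("\\=", "=").split("&")
    return dict(pair.split("=", 1) for pair in pairs if pair)


settings = parse_zoo_cfg_extra(r"syncLimit\=5&tickTime\=2000&initLimit\=10")

tick_ms = int(settings["tickTime"])
# Time followers get to connect and sync to the leader at startup.
init_timeout_ms = tick_ms * int(settings["initLimit"])
# How far a follower may fall behind the leader before it is dropped.
sync_timeout_ms = tick_ms * int(settings["syncLimit"])

print(settings, init_timeout_ms, sync_timeout_ms)
```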

Pro Tip: A great benefit we have found is the ability to modify Zookeeper Java Environment settings through Exhibitor for performance and Garbage Collection tuning.

Ensemble Registration

The following images outline the steps of the consensus process during Ensemble deployment.

  • Step 1: Initial gossip
  • Step 2: First node added
  • Step 3: IP is pulled down
  • Step N: All boxes are up

Self Healing

The following images outline the event of a node failure in an auto-managed ensemble.

  • Step 1: Node falls out of service
  • Step 2: ASG brings node up
  • Step 3: New node registered
  • Step 4: Denial of re-entry