Development Team

Vision

A world-class development team of software engineers and managers who make our customers happy when using our product(s). Our products should offer a broad, rich feature set, high availability, high quality, fast performance, trustworthy security, and reliable operation.

Mission

The development department strives to deliver MRs fast. MR delivery is a reflection of:

  • providing new product requirements
  • resolution of customer issues/bugs
  • fixing security problems
  • increasing availability, quality, and reliability
  • fostering open source community contributions
  • improving user experience
  • fostering best agile practices for fast iterations

The department also focuses on career development and process to make this a preferred destination for high-performing software engineers.

We use data to make decisions. If data doesn’t exist, we use anecdotal information. If anecdotal information isn’t available, we use first principles.

Product

We will continue our strong partnership with Product to make Bramble the best WFM platform on the planet. While we continue adding features to the product, we must also work to identify technical debt and bring it to the prioritization discussion. We expect that each Engineering Manager is already addressing technical debt with their Product Manager. The benefits of retiring technical debt include maintaining or increasing feature velocity, increasing developer happiness, increasing contributions, and reducing cost.

Quality

Our team continues to own quality, through work with automation, tooling, and metrics.

Sales

We continue to support Sales initiatives and enable them through our fulfillment section.

UX

User experience is a continued focus area and competitive advantage. We support this effort both in product development and in our architecture.

Infrastructure

Supporting the scaling of our SaaS solution is an important component of our business. We continue our ongoing efforts to meet the scaling requirements requested by Infrastructure.

Architecture

As we continue to scale the Bramble Product and consider additional business opportunities, our architecture will become more important. Architecture is crucial to our success and needs to be part of our thinking on a constant basis.

Organizational responsibilities

The development team is responsible for developing products in the following categories:

How We Work

Onboarding

Welcome to Bramble! We are excited for you to join us. Here are some curated resources to get you started:

Cross Functional Collaboration

Decisions requiring approvals

At Bramble we value freedom and responsibility over rigidity. However, there are some technical decisions that will require approval before moving forward. Those scenarios are outlined in our required approvals section.

Learning Resources

For a list of resources and information on our Bramble Learn channel for Development, consult this page.

Continuous Delivery

We eschew a regular release cadence (e.g. monthly) in favor of a more continuous delivery model. This means issues are resolved in a constant flow.

Rapid Action

Rapid Action is a process we follow when a critical situation arises needing immediate attention from various stakeholders.

What deserves Rapid Action

Any problem with both high severity and broad impact is a potential Rapid Action. For example, a performance problem that causes latency to spike by 500%, or a security problem that risks exposing customer data.

What if it only affects one customer?

If the problem only affects one customer, consider a customer escalation process as the alternative.
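
The guidance above can be read as a small decision rule. The following is a minimal sketch of that rule, not an official triage tool. The `Problem` fields, the "more than one customer" threshold for broad impact, and the `"normal prioritization"` fallback are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Problem:
    high_severity: bool      # e.g. a large latency spike or a risk of exposing customer data
    customers_affected: int  # hypothetical measure of how broad the impact is


def triage(problem: Problem) -> str:
    """Suggest a process for a problem, following the guidance above."""
    if not problem.high_severity:
        # Assumption: anything that is not high severity follows the usual process.
        return "normal prioritization"
    if problem.customers_affected <= 1:
        # High severity but narrow impact: consider customer escalation instead.
        return "customer escalation"
    # High severity and broad impact: a potential Rapid Action.
    return "potential rapid action"
```

The result is only a suggestion. Whether a situation actually becomes a Rapid Action remains a judgment call for the people involved.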

Rapid Action Process

When a situation is identified as a potential Rapid Action the following actions are recommended:

  1. Identify the problem(s) to solve. Taking a data-driven approach gives us a measurable way to quantify the problem and track the relevant metrics throughout the rapid action.
  2. Identify the exit criteria or goals that would resolve the stated problems of the rapid action. Highlight key dashboards or charts the DRI should be tracking to determine whether progress is trending in the right direction, so that adjustments to the work, goals, or allocation of individuals to the rapid action initiative can be made as needed. Ideally, we are able to define and agree on the exit criteria among stakeholders before fully dedicating engineers to the effort.
  3. Continue to iterate on the above step until the stated goals are achieved, or have the DRI continue to iterate on the goals of the rapid action. The DRI will be the one to work with stakeholders to discuss tradeoffs between progress made in the stated effort vs. the time commitment of Engineering DRIs.

Administrative Tasks

  1. Create an epic in the Bramble org group describing the problem and the resolution criteria as briefly and precisely as possible.
    1. Apply the rapid action label.
    2. If the problem is related to security, make the epic confidential.
    3. If possible, list existing issues that are in the scope of this Rapid Action.
  2. Identify the stakeholders involved and @-mention them on the epic. It is a good idea to over-communicate this problem, both for raising awareness and gathering ideas.
  3. Decide on a Directly Responsible Individual (DRI) and clearly note this in the epic description. This decision should be made as soon as possible, so leadership and responsibility are clear. However, because this decision must be made quickly, it should not be considered final.
  4. Clear the schedules of the DRI and anyone else who is expected to be involved in resolving the problem. Rapid Actions are both important and urgent, so they should displace less important and urgent work. For example, if an engineer is asked to help resolve a Rapid Action, the deliverables currently assigned to them should be re-assigned or re-scheduled. Rapid Actions are stressful and time-consuming, so quickly shifting other work is a way to soften the impact.
  5. Set up a standup for every business day and invite the correct stakeholders and participants.

Optionally, to facilitate communication, you might:

  1. Share a Zoom link dedicated to an immediate discussion.
  2. Create a Slack room dedicated to ongoing discussion with the naming convention RA-#epic-id.

Responsibilities of the DRI

The DRI is responsible for coordinating the effort to resolve the problem, from start to finish. In particular, the DRI is responsible for:

  • Maintaining the problem and resolution criteria in the epic description.
  • Running the daily standup related to the epic.
  • Tasking sub-DRIs as necessary. These sub-DRIs might be responsible for specific parts of the problem (part A/B/C), specific perspectives (Engineering/Infrastructure/Product), specific timezones (AMER/EUR/APAC), etc.
  • Ensuring that progress is being made toward mitigation and resolution. This may include coordinating problem-solving efforts across timezones so we can have Bramble team members working on the problem 24 hours a day.
  • Ensuring that work that is in-scope as part of a Rapid Action isn’t included in the Development Escalation Processes (a.k.a. infradev).
  • Updating stakeholders at least daily.
  • Detailing recommended follow-up actions once the problem has been solved (enhancements, refactoring, etc).

Please note that customers can be stakeholders. The DRI can seek assistance with customer communication in Slack at #support_escalations.

Status Update Template

The DRI should post a summary status update on the epic at least daily. The following format is recommended (provided here in Markdown format for easy copy/pasting into issues):

**YYYY-MM-DD Update**

**Progress since last update:**

This section describes what changes have been deployed to production and any other notable progress or accomplishments.

**Progress expected by next update:**

This section describes what you expect to accomplish prior to the next update. For example, what work is currently in progress (include links to MRs), when do you expect it to be deployed, and what do you expect the effect(s) to be?

**Blockers:**

This section describes any specific obstacles preventing progress. What is needed to overcome them?  Are there team members (e.g. executives, domain experts) these concerns should be escalated to?

**Praise:**

This section is used to highlight specific praise for team members contributing to the Rapid Action.  It is important to [say thanks](/handbook/values/#say-thanks).
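
If you post updates often, the template above can be filled in programmatically. This is a minimal sketch under stated assumptions: `render_status_update` and its `notes` argument are hypothetical names, and the `"None."` placeholder for an empty section is an illustrative choice, not part of the template.

```python
from datetime import date

# Section headings, in the order used by the template above.
SECTIONS = (
    "Progress since last update",
    "Progress expected by next update",
    "Blockers",
    "Praise",
)


def render_status_update(day: date, notes: dict[str, str]) -> str:
    """Render a status update in the Markdown layout of the template above."""
    parts = [f"**{day.isoformat()} Update**"]  # isoformat() yields YYYY-MM-DD
    for section in SECTIONS:
        parts.append(f"**{section}:**")
        # Assumption: sections without notes get an explicit placeholder
        # rather than being dropped, so every update has the same shape.
        parts.append(notes.get(section, "None."))
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(render_status_update(date.today(), {"Blockers": "Waiting on a domain expert review."}))
```

The returned string can be pasted directly into a comment on the epic.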

Resolution

Once the resolution criteria have been satisfied:

  1. Close the epic.
  2. Host a retrospective to understand what about the rapid action process could be improved. (Note: there may be other retros for more specific sub-efforts of the rapid action. This retro should act as a touch point to confirm that collaboration and communication worked.)
  3. Communicate the resolution to stakeholders.
  4. Consider awarding discretionary bonuses to the people who stepped in to help resolve the problem.

Email alias and roll-up

  1. Available email aliases (a.k.a. Google groups): managers', directors', and the VP's teams. Each alias includes everyone in the respective organization.
  2. Naming convention: team@brmbl.io. Examples:
    • Managers: configure@brmbl.io includes all the engineers reporting to the Configure backend engineering manager.
    • Directors: ops-section@brmbl.io includes all the engineers and managers reporting to the director of engineering, Ops.
    • VP of Development: development@brmbl.io includes all engineers, managers, and directors reporting to the VP of Development.
  3. Roll-up: teams roll up by the org chart hierarchy:
    • Engineering managers' aliases are included in their respective sub-department aliases.
    • Sub-department aliases are included in the Development alias.
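
The roll-up described above behaves like nested group membership: sending to a higher-level alias reaches everyone in the aliases it contains. The sketch below illustrates that idea only. The three alias names come from the examples above, while the individual addresses and the `expand` helper are hypothetical, and the real Google group configuration may differ.

```python
# Hypothetical data: each alias maps to its direct members, which may be
# individual addresses or other aliases. The individual addresses are made up.
ALIASES = {
    "configure@brmbl.io": ["eng-a@brmbl.io", "eng-b@brmbl.io"],
    "ops-section@brmbl.io": ["configure@brmbl.io", "configure-em@brmbl.io"],
    "development@brmbl.io": ["ops-section@brmbl.io", "ops-director@brmbl.io"],
}


def expand(alias: str, aliases: dict[str, list[str]] = ALIASES) -> set[str]:
    """Return every individual address reachable from an alias."""
    people: set[str] = set()
    seen: set[str] = set()
    stack = [alias]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against accidental cycles between aliases
        seen.add(current)
        if current in aliases:
            # A nested alias: visit its members as well.
            stack.extend(aliases[current])
        else:
            # Not an alias, so treat it as an individual address.
            people.add(current)
    return people
```

With this sample data, `expand("configure@brmbl.io")` returns only the two engineers, while `expand("development@brmbl.io")` also includes the manager and the director.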

Development Escalation Process

Reducing the impact of far-reaching work

Because our teams are working in separate groups within a single application, there is a high potential for our changes to impact other groups or the application as a whole. We have to be careful not to inadvertently degrade overall system quality, availability, reliability, performance, or security.

An example would be a change to user authentication or login, which might impact seemingly unrelated services.

Far-reaching work is work that has a large “blast radius” and includes changes to areas which will:

  1. be utilized by a high percentage of users
  2. impact entire services
  3. touch multiple areas of the application
  4. potentially have legal, security, or compliance consequences
  5. potentially impact revenue

If your group, product area, feature, or merge request fits within one of the descriptions above, you must seek to understand your impact and how to reduce it. Your plan might include creating a one-off process for those types of changes, such as:

  • Creating a rollout plan procedure
    • Consider how to reduce the risk in your rollout plan
    • Document how to monitor the rollout while in progress
    • Describe the metrics you will use to determine the success of the rollout
    • Account for different states of data during rollout, such as cached data or data that was in a previously valid state
  • Requiring feature flag usage
  • Changing a recommended process to a required process for this change, such as a domain expert review
  • Requesting manual testing of the work before approval
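
The five criteria for far-reaching work listed earlier in this section can be captured as a simple self-assessment checklist. The sketch below is illustrative only: `ChangeAssessment`, its field names, and `is_far_reaching` are hypothetical, and answering each question still requires judgment from the team.

```python
from dataclasses import dataclass, fields


@dataclass
class ChangeAssessment:
    """One flag per criterion in the far-reaching work list."""
    used_by_high_percentage_of_users: bool = False
    impacts_entire_services: bool = False
    touches_multiple_areas: bool = False
    has_legal_security_or_compliance_consequences: bool = False
    may_impact_revenue: bool = False


def is_far_reaching(change: ChangeAssessment) -> bool:
    """A change is far-reaching if it matches at least one criterion."""
    return any(getattr(change, f.name) for f in fields(change))


# Example: a change to login, which is widely used and touches several areas.
login_change = ChangeAssessment(
    used_by_high_percentage_of_users=True,
    touches_multiple_areas=True,
)
```

A change that matches any criterion should get a plan to reduce its impact, such as a rollout plan, required feature flags, a domain expert review, or manual testing before approval.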