BBID Outage Failsafe

With the forced migration to BBID, your ability to access your data disappears whenever BBID has an outage. I've brought this design flaw up on numerous occasions and no one seems to acknowledge it as a real issue (hence the reason I'm here). It is so unacknowledged that there isn't even a category for BBID in this feature suggestion system. There is a lot of focus on what happens if MY identity provider goes down or if I configure it incorrectly, but none on what happens if BBID itself goes down (which it has quite a bit in recent months). I like the idea of BBID, but there needs to be a backdoor/failsafe in case there is an outage. If BBID goes down, we can't access emergency contact information, medical information (such as allergies and medications), or even open a support ticket. TL;DR Backdoor/failsafe login for BBID outage
  • Stephen Alonso
  • Mar 30 2022
  • John Vogel commented
    15 Apr, 2022 08:08pm

    Thanks Jonathan, you are correct.

    The uptime of Blackbaud ID is of the utmost importance and we have multiple redundancies/failovers in place for all critical processes.

    To highlight the importance of Blackbaud ID, the long-term vision is for Blackbaud ID to serve as the only account you need to access any Blackbaud solution. Specific to the education space, think about how appreciative parents will be when they only need a single account to engage in all aspects of their child's education contained within a Blackbaud solution, such as financial aid, tuition, enrollment, learning, volunteering, fundraising, events, etc.

  • Jonathan Tepper commented
    13 Apr, 2022 04:31pm

    Perhaps what needs to be communicated is that there are failover systems in place. Similarly, when we did authentication in-house with Active Directory, systems that signed into Active Directory via LDAP had a failover in place, so if one Active Directory server was not responding, it would go to another. This was part of my organization's business continuity / disaster recovery plan.


    Reading the thread, perhaps this needs to be communicated? I am sure that, as part of Blackbaud's BBID plan, there are many failover systems in place to mitigate issues and keep to their high-level SLA. No system can be up 100% of the time, but it would be good to inform your clients and assure us that the BBID systems have failover systems in place. I assume Blackbaud has this, and I say so in good faith given what has been posted in this thread already.
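For readers unfamiliar with the kind of failover described above, here is a minimal sketch of the idea, assuming a list of directory servers that are tried in order. It is not Blackbaud's or Active Directory's actual mechanism; `authenticate_with_failover`, `AllServersUnavailable`, and the `try_server` callable are hypothetical names used only for illustration.

```python
from typing import Callable, List, Sequence


class AllServersUnavailable(Exception):
    """Raised when none of the configured servers could be reached."""


def authenticate_with_failover(
    servers: Sequence[str],
    try_server: Callable[[str, str, str], bool],
    username: str,
    password: str,
) -> bool:
    """Try each server in order and return the first answer received.

    `try_server` is a hypothetical callable that returns True/False for
    valid/invalid credentials and raises ConnectionError when the server
    itself is not responding.
    """
    errors: List[str] = []
    for server in servers:
        try:
            # A reachable server gives a definitive answer, so stop here.
            return try_server(server, username, password)
        except ConnectionError as exc:
            # This server is down; remember why and move to the next one.
            errors.append(f"{server}: {exc}")
    raise AllServersUnavailable("; ".join(errors))
```

Note that only an unreachable server triggers the fallback; a rejected password from a healthy server is returned as-is rather than retried elsewhere.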

  • John Vogel commented
    12 Apr, 2022 03:32pm

    Stephen,

    My recent post to Brian addresses the heart of your recent response for emergency preparedness.


    There is one important piece that I do want to clarify regarding your callout of "break glass" accounts. These are not meant to bypass the authentication of the app a user is accessing, but to bypass the authentication at an IdP that has an SSO connection to the app. The good news here is that at Blackbaud, we recommend our customers establish these accounts so that they can log in if an issue arises with their SSO connection (the most common being an expired certificate that needs to be renewed with the IdP).


    All the best,

    John


  • John Vogel commented
    12 Apr, 2022 03:20pm

    Brian,

    Please know the overall experience of end users is always my top priority and serves as the north star for every decision I make as a product manager. Case in point, here's a quote from my earlier response:

    "the impact incidents have on users not being able to sign in or sign up is what matters"

    With regards to information that "must be available at all times," it is important that your school's emergency preparedness and disaster recovery plans account for accessing critical information. For example, and talk about timing, just this morning when dropping my son off at daycare, the app parents use to sign their children in and out of school was having issues and the admins were manually keeping track of children arriving. So then I asked, “If the system isn’t available, what do you do about allergies and emergency contact information?” to which she replied, “Oh, don’t worry, we keep hard copies of all critical information.”


    It's also important to note that Blackbaud ID is just one piece of the overall process that is required to be stable to serve up critical information for your school. For example, there are upstream dependencies such as internet service providers that must be up to serve information. While rare, accidents can happen such as a fiber optic line getting cut that removes internet availability. This happened in Charleston in May 2021 and impacted some 911 services.


    To that end, we strongly encourage schools to include contingency plans for utility, internet, or vendor outages and incidents in their school’s “Policies and Procedures Guide.” For example, think through what your school should do if the tier 2 internet service provider to your whole region experienced an outage due to a natural disaster, cyber warfare attack, or construction accident. Each school’s plan will differ, but your plan could include monthly backups of vital medical and emergency contact data, exported to a USB drive or printed binder that you keep securely locked in a safe in the school administrator's office or the school nurse’s office. Consult your school’s legal advisor for recommendations about proper handling of personal data when saved to portable media devices or printed paper for emergency purposes. Finally, ensure all school staff, students, and families are aware of their responsibilities and the resources available to them before an emergency need arises.


    Blackbaud services, including Blackbaud ID, are available almost all the time, but there are rare times when information stored within Blackbaud may be unavailable. This is why SaaS companies (such as Google, Microsoft, and Blackbaud) have SLAs close to, but not at, 100%: incidents happen. I want to re-emphasize our commitment to consistent processes for managing incidents and investing in improvements. This is done to best mitigate risks to service continuity as they are identified.


    All the best,

    John

  • Stephen Alonso commented
    12 Apr, 2022 03:08pm

    April 11th, 2022: Incident ID 000345456

  • Stephen Alonso commented
    12 Apr, 2022 03:02pm

    Hi John,

    I would like to thank you for taking the time to reply to our concerns. As several others have mentioned, your response doesn’t address the underlying problem but is just a boilerplate reply about why BBID is good and the difference between incidents and outages (which to the end user is just semantics, as a user who can’t sign in won’t care about the difference).

    I want to be clear that I do think Blackbaud ID is good and offers many benefits over the existing login mechanism, such as the ones you mention in the blog post. I also accept that when there is a disruption to BBID, rare as it may be, no one will be able to log on. But my concern isn’t directed toward the teacher trying to post an assignment, the bookkeeper trying to run a report, or the student trying to see their grade, as BBID does come back up quickly enough that they can wait the few hours for it to be resolved. My concern has been, and still is, that in the event of an emergency there is no way for a nurse or school administrator to access medical and emergency contact information.

    There are several ways the industry at large is still able to provide access to data securely when the authentication mechanism isn’t available. Many companies, such as Microsoft, accomplish this through “break-glass” accounts. This could be easily implemented by allowing a specific user account to continue to use the existing login mechanism and sending an email notification to all platform managers/organization admins whenever the account is used. You could even make it so the user can only access read-only views of specific emergency information to minimize the risk even more.

    There are many viable solutions to this concern, but your reply doesn’t seem to even entertain the possibility of alternate solutions. Hopefully Blackbaud will be more proactive on this issue rather than reactive to any student medical emergencies that might come up.

    -Stephen
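To make the break-glass proposal above concrete, here is a rough sketch under stated assumptions: a dedicated account that can only read a fixed set of emergency fields, with every use reported to the organization's admins. Nothing here reflects an existing Blackbaud feature or API; `BreakGlassAccess`, `EMERGENCY_FIELDS`, the field names, and the `notify` callable are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical whitelist: the only fields the break-glass account may read.
EMERGENCY_FIELDS = ("emergency_contacts", "allergies", "medications")


@dataclass
class BreakGlassAccess:
    """Read-only emergency view that alerts admins every time it is used."""

    records: Dict[str, Dict[str, object]]  # student_id -> full record
    admin_emails: List[str]
    notify: Callable[[str, str], None]  # hypothetical (recipient, message) sender

    def emergency_view(self, account: str, student_id: str) -> Dict[str, object]:
        # Alert every admin before returning data, so use is never silent.
        message = (
            f"Break-glass account '{account}' viewed emergency data "
            f"for student {student_id}"
        )
        for email in self.admin_emails:
            self.notify(email, message)

        # Return a copy containing only whitelisted fields; there is no
        # write path, so the view is read-only by construction.
        record = self.records[student_id]
        return {key: record[key] for key in EMERGENCY_FIELDS if key in record}
```

Keeping the whitelist small and the notification unconditional is what limits the risk: the account is useless for anything but emergencies, and any use is immediately visible.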

  • Brian LeBlanc commented
    11 Apr, 2022 03:51pm

    John, respectfully, your message misses the mark entirely. Your response is from a perspective of security and data integrity, but your end users are looking at it as a significant issue that impacts emergency information for our constituents that simply must be available at all times.

    Please consider the following scenario: an incident is preventing users from logging into BBID. The reason is, frankly, irrelevant; all the end users know is that they can't access needed information, and whether it's an incident or an outage doesn't make a bit of difference. The school nurse reports a medical emergency that requires immediate contact of emergency services and parents, in that order. EMS requests a copy of the student's medical information so that they can stabilize the patient while in an ambulance en route to the hospital.

    Because BBID is down and staff can't log in, in theory staff would call the parents to obtain the information. However, we don't have access to the parent contact information, because - again - we can't access the contact card. EMS shows up at the school and we are unable to provide them the information requested. They are left flying blind in a medical emergency because we have no way to obtain critical, time-sensitive information.

    I recognize the importance of data integrity and protecting against nefarious events. However, part of a well-designed web of security is ensuring that a single point of failure does not take the whole thing down. The scenario described above suffers failure because of that single point. I don't know what the solution is, but the fact is that we desperately need something. Whether it's an encrypted backup to a local machine or a failsafe login for the site owner, or something else, we simply must have access to our data at all times. There is no tapdancing around it. "We have people working on it and on call 24/7 to fix it" is not an acceptable answer in an emergency.

  • Vicky Lopuchowycz commented
    11 Apr, 2022 11:44am

    We've had more outages and more inability to work since moving to OnSuite and being hosted than when we hosted EE in-house ourselves. It's terribly frustrating, and the thought of moving to BBID for everyone is unsettling. There's always a company line about a full investigation, and regardless of whether it's an incident or an outage, it impedes the ability to get things done.

  • John Vogel commented
    8 Apr, 2022 08:38pm

    Hi Stephen,

    I'm the product manager at Blackbaud for Identity and Access Management on the SKY Platform where Blackbaud ID is a critical component.

    Before addressing your feature request, there are a few related topics I’d like to address from your post:

    • Migrating to Blackbaud ID: Directly benefits your organization and users for the many reasons noted in the Why? section of the blog post you linked to in a comment.

    • Incident management: I acknowledge that Blackbaud ID related incidents have occurred at a higher frequency this school year than in years past. With every incident, we perform a complete and thorough root cause analysis to inform what investments we will make in the future to prevent the issue from recurring. Your Account Executive or Customer Success Manager can provide you with feedback on any incident and the mitigating steps taken. While most of the downtime over the past year was due to upstream dependencies with third-party vendors, we recognize that it remains our responsibility to ensure continuity of service regardless of the underlying source of an incident. The impact incidents have on users not being able to sign in or sign up is what matters, and that is why Blackbaud has stringent SLAs in place to protect you, our customers, when unfortunate circumstances do arise. I want to emphasize our commitment to consistent processes for managing incidents and investing in improvements. This is done to best mitigate risks to service continuity as they are identified.

    • Incidents vs Outages: I’d like to take a moment to differentiate incidents from outages. Incidents include the universe of issues when a service degradation occurs, while outages are a subset of incidents where a complete disruption occurs. I want to be clear it is extremely rare for Blackbaud ID to experience an outage, and all the incidents we’ve seen this year involving Blackbaud ID have intermittently impacted a segment of customers and users.

    Regarding your feature request, identity and authentication serve as the tip of the spear when it comes to protecting your organization from bad actors and nefarious acts. As such, security is a top priority for authentication services and identity providers. This is true for providers across the industry, including Blackbaud ID. For that reason, I will maintain the same path as the industry at large and not compromise security to alleviate the impact of a temporary incident.


    All the Best,

    John

  • Stephen Alonso commented
    7 Apr, 2022 03:45pm

    April 7th, 2022: Incident ID 000344392

  • Stephen Alonso commented
    7 Apr, 2022 03:39pm

    Blackbaud just released a blog post about the move to BBID (https://community.blackbaud.com/blogs/17/8294). Make sure you make your voice known in the comment section.

  • William Jacoby commented
    4 Apr, 2022 03:20pm
    This has certainly happened to me more than once. Obviously there should be some kind of back door considering that students can't use a learning system they can't get into.
  • Carol Vila commented
    1 Apr, 2022 05:24pm

    It is critical to be able to access our student information during an outage such as the one we experienced on 3/30.

  • Rick Geyer commented
    1 Apr, 2022 01:40pm

    100% agree. There needs to be a way to access SIS data when BBID is experiencing issues.

  • Stephen Alonso commented
    30 Mar, 2022 04:16pm
    March 30th 2022: Incident ID 000341957
    February 17th 2022: Incident ID 000330455