We’re aware of ongoing federation issues for activities being sent to us by lemmy.ml.

We’re currently working on the issue, but we don’t have an ETA right now.

Cloudflare reports a 520 (Origin Error) when lemmy.ml tries to send us activities, but the requests don’t seem to arrive properly at our proxy server. Federation with all other instances is working fine so far, although we have seen a few other requests, unrelated to activity sending, that occasionally report the same error.

Right now we’re about 1.25 days behind lemmy.ml.

You can still manually resolve posts in lemmy.ml communities or comments by lemmy.ml users in our communities to make them show up here without waiting for federation, but this is obviously no substitute for regular federation.
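
For anyone who wants to script the manual resolve step, here is a minimal sketch that only builds the request URL. It assumes the `resolve_object` endpoint of Lemmy's v3 HTTP API with its `q` parameter; the post ID is a made-up placeholder, and the actual request most likely has to be sent as a logged-in user, which is not shown here.

```python
from urllib.parse import urlencode


def build_resolve_url(local_instance: str, remote_object_url: str) -> str:
    """Build the URL that asks local_instance to fetch a remote post or comment.

    Assumption: the endpoint path /api/v3/resolve_object and the `q` query
    parameter follow Lemmy's v3 HTTP API; verify against your instance version.
    """
    query = urlencode({"q": remote_object_url})
    return f"https://{local_instance}/api/v3/resolve_object?{query}"


# Hypothetical post ID, for illustration only.
url = build_resolve_url("lemmy.world", "https://lemmy.ml/post/123456")
print(url)
```

The remote URL is percent-encoded so that it survives as a single query value.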

We’ll update this post when there is any new information available.


Update: Federation has resumed and we’re down to less than 5 hours of lag. The remainder should be caught up soon.

Unfortunately, the root cause has still not been identified.

  • MrKaplan@lemmy.world · 2 days ago

    I wouldn’t say usually, but they can happen from time to time for a variety of reasons.

    It can be caused by overly aggressive WAF (web application firewall) configurations, proxy server misconfigurations, bugs in Lemmy, and probably a few other things.

    Proxy server misconfiguration is a common one that we’ve seen other instances run into from time to time, especially where federation works between Lemmy instances but e.g. Mastodon -> Lemmy doesn’t work properly, because the proxy configuration only matches Lemmy’s specific behavior rather than any spec-compliant request.
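
To illustrate the kind of mismatch meant here, the sketch below checks whether a `Content-Type` value is one of the two media types the ActivityPub spec names for activities. It is a hypothetical example, not taken from any real proxy configuration: the function name is made up, and whether a given proxy rule fails in exactly this way is an assumption.

```python
from email.message import Message

# Profile URI that the ActivityPub spec pairs with application/ld+json.
AS2_PROFILE = "https://www.w3.org/ns/activitystreams"


def is_activitypub_media_type(header_value: str) -> bool:
    """Return True if a Content-Type value denotes an ActivityPub payload.

    Hypothetical helper: parses the header instead of comparing raw strings,
    so casing and extra parameters such as charset do not break the match.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    media_type = msg.get_content_type()  # lowercased "type/subtype"
    if media_type == "application/activity+json":
        return True
    if media_type == "application/ld+json":
        profile = msg.get_param("profile")
        return profile is not None and AS2_PROFILE in str(profile).split()
    return False


print(is_activitypub_media_type("application/activity+json"))
print(is_activitypub_media_type(
    'application/ld+json; profile="https://www.w3.org/ns/activitystreams"'
))
print(is_activitypub_media_type("application/json"))
```

A rule that only string-matches the exact header value one implementation happens to send would reject the other valid forms, which is one way a setup can work for some senders and fail for others.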

    Overly aggressive WAF configurations tend to be the result of instances being attacked or overloaded, either by DDoS or by aggressive AI service crawlers.

    Usually, when there are no configuration changes on either side, issues like this don’t just show up randomly.

    In this case, there was a change on the lemmy.ml side, and we don’t believe a change on our side coincided with the time this started happening (we don’t have the exact date for when the underlying issue began). The behavior on the sending side might have changed with the Lemmy update, and other instances might simply happen not to be affected. We currently believe this is likely just exposing an issue on our end that already existed before the changes on lemmy.ml, except that the specific logic was previously not used.