Playing around with Amazon Echo (technical interface discussion)

@troy This Engadget article says that the NDA just ended for SDK developers. Is that true? Can we get some more info about what you have working? Maybe the community can even work up a full ST app for Echo…

Sure, I’ll take that as authoritative for now :smile:

Here is the main file that does the JSON parsing and builds the responses:

You can check out the rest of the repo… it just sends REST on/off commands to the typical basic OAuth REST sample that’s out there for the SmartThings side.

That happens here
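For anyone who wants a feel for that step without opening the repo, here is a minimal sketch of building such an on/off request in Python. It is not the code from the repo: the endpoint URL, the `/switches` path, the JSON body, and the token are all hypothetical placeholders, and the real values come from your own SmartThings OAuth setup.

```python
import json
import urllib.request

# Hypothetical placeholders: the real endpoint URL and token come from your
# own SmartThings OAuth setup and are not given in this thread.
ENDPOINT_BASE = "https://example.invalid/smartapp/endpoint"
ACCESS_TOKEN = "replace-with-your-oauth-token"


def build_switch_request(command, base=ENDPOINT_BASE, token=ACCESS_TOKEN):
    """Build (but do not send) a PUT request asking for 'on' or 'off'."""
    if command not in ("on", "off"):
        raise ValueError("command must be 'on' or 'off'")
    body = json.dumps({"command": command}).encode("utf-8")
    return urllib.request.Request(
        url=base + "/switches",  # assumed path, for illustration only
        data=body,
        method="PUT",
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
    )


def send(request):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request) as response:
        return response.status
```

Building the request separately from sending it makes it easy to inspect what would go over the wire before pointing it at a real endpoint.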

If anyone recreates all this in SmartThings that would be cool, but right now this app is like 99.9% ready to be multi-tenant, so it doesn’t need to be recreated.

It’s at http://echo-command.herokuapp.com

All we need to add is a way to get “your” SmartThings OAuth URL into it. That would probably take 15 minutes in a PR if someone wants to do it, or I’ll try it next week or so, as I get more time.

Also check out the utterances and intents in this folder:

This is exciting! I’ve had mine for about a week, been looking forward to this integration. Great work!

So, how does one actually get this on the Echo? I’ve authorized with SmartThings but not really sure what the next step is here.

I’m at Amazon AWS Loft for a 3 1 hour seminar + networking time in San Francisco…

99% of attendees are male, but Alexa voice is female. Hmmm. :confused:

Need more women in IT and CS. Of course, Alexa might sound more like Pierce Brosnan then. :smile:

Based on yesterday’s seminar, what are the odds of a ST integration?

PS. If Alexa’s voice doesn’t cut it, you could always try this male recorded voice, which is sarcastic and hits on the lady of the house.

I’m still on the invite list to order one :frowning:

If I can get my hands on one I will totally automate everything in the house with it!

The odds are very good – heck, 100%, but the “completeness (level)” and “quality” of the integration will depend on unknown factors, policies, and strategies of Amazon.

Some Random Observations (nothing official or verified, and based on minimal presentation and questions):

Based on CURRENT capabilities, many will change, some won’t…

  • Echo can only speak when spoken to (i.e., woken with the wake word, “Alexa” or “Amazon”). This means that the unit cannot be used like a Sonos to play arbitrary alert or informational messages (“The mail has arrived”, “An intruder may be trying to open the front door.”). Echo could be programmed, however, to answer questions like “Are any of my windows open?”; i.e., it provides information upon request only.

  • “Responses” include both spoken (synthesized) responses out of the speaker, as well as actions (lights on or off, mode change, etc.), and, finally, custom designed “Cards” that are sent to the Amazon Echo App. I perceive a world where the “Cards” can be directed to specific screens in the home (i.e., the nearest screen…). “Are any doors open?” could result in a spoken list of door names and/or a “Card” depicting the various doors with closed and open icons.

  • Casual 3rd Party integrations are similar to SmartApps … Approved developers get a sandboxed personal environment and can run the “Echo Apps” on their own Echos. I don’t think there is any way to share these to non-developers except through the Certification & Publication process.

  • These Echo Apps will provide the majority of functionality of Echo, I believe. The end-user, however, needs to use the name of the App in order to activate it, and if there are name conflicts of installed Apps, they would resolve these in advance; perhaps with pseudonyms (e.g., “Uber, pick me up.” and “Uber, extra large pepperoni pizza.” … where the latter is for Uber Pizzeria, would require renaming one of the Apps).

  • I don’t recall the exact phrasing that triggers a context to go to an App; but it does not happen through just normal language processing (e.g., “Call me a cab.” doesn’t work, you must specify the App name).

  • There must be a higher level of integration with some sort of “partnership agreement”. For example, the current Hue integration does not require the user to refer to the “Hue” app in their request… just “Bedroom lights on”. I’m assuming that without a partnership (or without the implementation of some sort of defaults in the future), we SmartThings developers would require our users to say “Alexa, SmartThings bedroom lights on”, or “Alexa, SmartThings, hello I'm home”; i.e., the name of the App is a mandatory keyword.

  • The presenter denied that “sponsored” results are a planned revenue strategy. In other words “Call me a cab” will not say “Would you like an Uber?” if Uber offers Amazon “a lot of money”. (Sure, right, yup, I totally believe you, Mr. Presenter.) Then again, sponsored search results are really Google’s and Microsoft’s domain and business model, right, not Amazon’s!

  • Personal Opinion: Sponsored results could be extremely valuable to SmartThings. If a user says “Alexa, turn on my lights” (just for the heck of it), Echo could respond “I'm sorry, but I don't know how to do that, but may I suggest the SmartThings Home Lighting Kit for only $199? Check the Echo App for details or say 'buy it' for a one click purchase.”.

  • Integration is Cloud-to-Cloud-to-Cloud: Amazon Echo Cloud - App Owners Cloud (could be an AWS cluster, of course) - SmartThings Cloud. Communicating within the LAN is not possible (no Echo to/from SmartHub V2 direct).

  • Utterance Definitions (I think the term is “Intents”, but I forget; I will update this post) and the resulting speech recognition and parsing are really amazing and a little frightening. Regular expressions are not used, and the format of the definitions is just to give examples: Amazon’s magic cloud will infer the rest (and will get better and better at this). So if you give the example for a SmartThings temperature sensor integration: “What is the temperature in the {bedroom | $room}?” (and perhaps a couple of similar examples as hints), Amazon Echo will automatically be able to parse the request “Is it cold in the basement?” before sending it to your API for processing.

  • So that’s a very, very rough example. The magic I’m implying is that the Echo parsing cloud is sometimes smart enough to recognize common variations on a phrase (What is the temperature? ~= Is it cold?), and it will get better at this over time as it learns language patterns from thousands of users. Other elements of the phrase are also parsed without needing complex regular expressions (bedroom ~= basement, but since this is tagged as a parameter, that will be passed to your API…).

  • Building “Goldilocks” lists of utterances (as short as possible, but long enough to give the magic parser accuracy and maximize functionality and natural language for the user) is an uncertain art. I would prefer that they provide a “hinting” language (or regex?) to make this less ambiguous; but, then again, magic technology will probably work very well, since natural language processing inherently requires artificial intelligence that rises to the level of magic, and that is fundamental to Echo.
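To make the utterance and response bullets above a bit more concrete, here is a rough Python sketch of what the developer side might look like once Echo’s cloud has already done the parsing. Everything named here is an assumption for illustration: the `GetTemperature` intent name, the `Room` parameter, the shape of the request and response dictionaries, and the example-phrase notation are placeholders, not Amazon’s actual format.

```python
# Example phrasings only; the notation is illustrative, not Amazon's real format.
SAMPLE_UTTERANCES = [
    "what is the temperature in the {Room}",
    "is it cold in the {Room}",
]


def build_response(speech, card_title=None):
    """Wrap a spoken reply, and optionally a card, in one response dict."""
    response = {"outputSpeech": {"type": "PlainText", "text": speech}}
    if card_title is not None:
        # The card reuses the spoken text; a real card could show more detail.
        response["card"] = {"title": card_title, "content": speech}
    return {"response": response}


def handle_intent(event, temperatures):
    """Answer a temperature question from an already-parsed request.

    `event` is assumed to carry the intent name and the tagged parameter;
    `temperatures` maps lowercase room names to readings.
    """
    intent = event["request"]["intent"]
    if intent["name"] != "GetTemperature":  # hypothetical intent name
        return build_response("Sorry, I can't help with that.")
    room = intent["slots"]["Room"]["value"].lower()  # hypothetical parameter
    if room not in temperatures:
        return build_response("I don't have a reading for the " + room + ".")
    speech = "The {} is {} degrees.".format(room, temperatures[room])
    return build_response(speech, card_title="Temperature")
```

The point of the sketch is that the handler never sees the raw sentence. Whether the user said “What is the temperature in the basement?” or “Is it cold in the basement?”, the code only receives the intent name and the room parameter.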

Conclusion

I don’t know what sort of arrangements have to be made between Amazon and SmartThings to establish an actual “partnership level” integration, but that is likely what should be done in the mid-term in order to give the most seamless experience. In the long-term, SmartThings or Samsung may provide their own speech processing service.

In the short-term, there is enough openness and API power in both SmartThings and Echo to allow Community Developers to develop quite useful integration, but they probably need to pay for hosting services … though perhaps like @625alex’s SmartTiles.click, maybe the SmartApp can provide sufficient end-points to essentially host it on the SmartCloud?

“Echo App” level integration (vs. some next higher level of “partner” integration) will always be handicapped a bit. It’s like the difference between what SmartTiles can do and what SmartThings official mobile apps and widgets can do now or in future versions. The SmartThings official mobile apps may lack certain features at any point in time, but the official apps (and any “partner” app developers, like Samsung Gear Watch?) will have super powers available.


Hope the above is interesting and not too sketchy – it’s just an outline. Questions and discussion welcome; though it will be easier to answer questions after a few of us start playing with the dev kit, and we establish a relationship with some of the Amazon Echo platform / developer support engineers (I’ll invite them to join the Community, of course! :sunny:).

Very interesting, Terry! Wish I had made it into the early devs list… no joy yet.

I already assumed Echo didn’t speak on its own. Being able to ask “Is everything locked?” or “Is the heat on?” etc. is really quite satisfying as a UX IMHO. Very Star Trek: the computer/ship doesn’t speak on its own there, either.

I’m going to name my garage door “pod bay” so I can say “Alexa, open the pod bay doors”.

Ouch. That’s really a mouthful. It’d be a lot better if we could somehow avoid that…

Yes… As more and more Echo Apps are created, there’s got to be ways to figure out how to automatically set the context.

I think it will be a bit like Android where you can set a “default app” for each intent (e.g., default alarm application, default browser, default weather service, default music player, … and default home automation service provider :arrow_right: SmartThings, of course!).
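Just to illustrate the idea (this is speculation, not anything Amazon has described), a “default app per intent” lookup could be as simple as the Python sketch below. The app list, the category names, and the assumption that something upstream has already classified the request into a category are all made up for the example.

```python
# Hypothetical data: installed apps and the user's chosen defaults per category.
INSTALLED_APPS = ["SmartThings", "Pandora", "Uber"]
DEFAULT_APPS = {
    "home automation": "SmartThings",
    "music": "Amazon Music",
}


def route(utterance, category):
    """Pick the app that should handle a request.

    An explicitly spoken app name always wins. Otherwise fall back to the
    user's default for the category, or None if no default has been set.
    """
    lowered = utterance.lower()
    for app in INSTALLED_APPS:
        if app.lower() in lowered:
            return app
    return DEFAULT_APPS.get(category)
```

With this kind of fallback, “bedroom lights on” would reach SmartThings without the app name, while “Play my station from Pandora” would still override the music default.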

Thanks, very interesting!

Not sure whether the difference in real use is because of groups or because of the special arrangement with Philips, but if you have a Philips group called “bedroom lights,” you can just say “Alexa, turn on the bedroom lights.”

Here’s a video from some guy on YouTube (we’ll forgive him the Google Glass since he gave clear detail where most people are skipping around) showing setup and use of Hues with Echo:

p.s. Speaking just for myself, if Alexa recommends I buy a product when I ask a non shopping question I’m going to throw it out the window and write to whoever’s commercial ran telling them to stay out of my house. Just saying it’s not a brand message I would find likely to increase my purchases. :wink:

I am assuming that Philips either has a special arrangement (“partnership level integration”) with Amazon Echo, or, since they are the only integrated lighting control so far, it is the default. The Hue integration does not require Groups to work … you can just say the names of particular lights, including dimming them.

For some reason, I can turn on my Hues, but not off – gotta put in a support call.

Agreed… but, like all things Amazon, I wonder what their long term revenue model will be. Currently you get to play unlimited Prime Music Radio stations (so they get a bit of annual revenue from Prime membership fee), and also can buy music tracks with no-clicks.

I think that some sort of sponsorship will be too hard to resist … but, just like Google, the challenge is to make it unobtrusive. The current shopping list feature will obviously be linked to your Amazon account – perhaps automatically filling your shopping cart or even sending you the items daily without confirmation (and letting you cancel / refuse delivery at any time).

Offering Amazon Echo App developers revenue streams is also important … unless developers will have to actually pay to use the platform and can then have in-app purchases (like Uber).

It works the same way with WeMo: you just say the group name. This guy is using WeMo switches, just saying “Alexa, turn Kevin’s Lights Off.”

I’m thinking for official home automation integration, the device group is the context.

As far as revenue stream goes, a friend of mine who was an analyst at IBM used to say it was a mistake to look at Amazon as a retailer. From a business model standpoint, they function more like a mutual fund. Basically they make all their money off the stock price. Profits hardly ever enter into it. So as long as they’re doing “cool,” the stock price funds the company. :wink:

So Home Automation context requests (i.e., on, off, dim, … and perhaps in the future requests for status like “Are any of my doors open”) could just flow through the various installed Amazon Echo Apps that are tagged for Home Automation?

The presenter said pretty clearly that this is not the scenario; i.e., the Echo App name is critical.

And, for example, requests for music do not “flow” through (i.e., by default, it looks at your Amazon Music library only). You have to say “Play my Taylor Swift station from Pandora” in order to choose anything but the default service.

I will try to drop him (or someone) an email and ask, because, yes, as you’ve observed, both WeMo and Hue do not require their “app names” to be spoken.

No way to know for sure, but I suspect the answer is going to be that if the devices can be added to the Connected Home section of the native Amazon Echo app, then you don’t need the brand name:

You can now control WeMo switches and Philips Hue lights using your voice.

First, connect and register your devices to your Wi-Fi and set individual names in their respective WeMo or Hue app. Then add them to the Amazon Echo app by saying “Alexa, discover my appliances.” Newly discovered devices will be shown with the assigned name in the Amazon Echo app under “Settings > Connected Home,” where you can also create groups which can be activated via voice control.

And check out this guy. He has both Hue and WeMo, and he has a group that includes some of each. And he can just say “Alexa, turn on kitchen lights” and they all come on. So I think it has to be the groups feature under Connected Home in the native Amazon app. The question then becomes how someone like SmartThings gets their controlled devices listed in those groups.
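Here is a small Python sketch of how I imagine the groups feature behaving, purely as a guess from watching it work: a group is just a named list of discovered devices, regardless of brand, and a spoken name is looked up as a group first and as a single device second. The class and method names are invented for the example and say nothing about how Amazon actually implements it.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    brand: str
    is_on: bool = False


def normalize(name):
    """Lowercase and collapse whitespace so spoken names match stored ones."""
    return " ".join(name.lower().split())


class ConnectedHome:
    """Hypothetical registry of discovered devices and user-defined groups."""

    def __init__(self):
        self.devices = {}
        self.groups = {}

    def add_device(self, device):
        self.devices[normalize(device.name)] = device

    def create_group(self, group_name, device_names):
        self.groups[normalize(group_name)] = [normalize(n) for n in device_names]

    def switch(self, target, on):
        """Switch a group or a single device; return the names that changed."""
        key = normalize(target)
        names = self.groups.get(key, [key])  # fall back to a single device
        changed = []
        for name in names:
            device = self.devices.get(name)
            if device is not None:
                device.is_on = on
                changed.append(device.name)
        return changed
```

If it works anything like this, the brand never enters into the spoken request. The open question is still how a third party gets its devices into that registry in the first place.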

Music would work very differently.

Well… The “Discovery” process (and use of groups) for Amazon Echo has been very inconsistent for me the few times I’ve tried to set it up so far.

The Echo vocally confirms discovery of my 6 “devices” (bulbs), but, except for 1 attempt, these do not show up in the Mobile App; therefore I cannot create or edit Groups.

In that 1 successful attempt (i.e., when the Devices were listed in the Mobile App), I created a group, but the Echo only turned on one bulb in the group and ignored my “off” request. Not a bad start, though.

When I renamed the group (as well as deleting and re-adding with new name), Echo said "cannot find a device or group with <newname>".

So that’s when I cleared (“Forget”) the entire Connected Devices and tried again.

There must be some sync issues between the Mobile App and the rest of the Echo system. I wonder if there is a workaround. Guess I have to try to replicate the behavior and check the Echo support and/or forums.

We just need a service running somewhere on the local network that responds to that :slight_smile: Need a network sniffer to see what dear old Echo is doing there.

Chatter has been that you need to press the button on the Hue bridge before asking Alexa to discover the devices. Not the official instructions, which are the other way around. FWIW

Yup… I’ve tried it both ways. Echo responds that it found 6 devices, and, I can even inconsistently control them … the problem is that the Mobile Echo App has only once shown me the list of devices (so that I can confirm their names, group them, etc.).

The trick may require initiating discovery from the Mobile App … but I’ve tried that too, and same result. I wish I could verbally ask Echo for some more diagnostics (e.g., “Alexa, list my devices.”). Regardless, sync to the Mobile App is critical to proceed, really.