Artificial Intelligence / Cognition Archives - The Robot Report

Funding the next wave of robotics

Episode features conversations with two VCs and explores robotics and AI investment trends.

In Episode 176 of The Robot Report Podcast, we feature an interview with venture capitalists Juliette Chevallier, Principal at Scale Venture Partners, and Jasmeet Singh, founder of JMOON Ventures.

It’s VC week here at the podcast.

This episode features interviews with Juliette Chevallier of Scale Venture Partners and Jasmeet Singh of JMOON Ventures, and it covers investment trends in robotics, emphasizing the importance of execution risk over technical risk.

Juliette Chevallier, Principal, Investments, Scale Venture Partners

Juliette Chevallier has a background in autonomous vehicles and robotics, having previously worked at companies like Google Chauffeur (now Waymo) and MIT spinoff Optimus Ride. She joined Scale Venture Partners about 2 years ago to lead their investment thesis on robotics, AI applications, and cybersecurity. Scale Venture Partners’ approach focuses on investing at the point of execution risk rather than technical risk, looking for companies with a working product and proven product-market fit. Juliette emphasizes the importance of understanding the customer ROI and business model as key criteria.

In her role as a VC, Juliette prefers to have a deep, hands-on involvement with portfolio companies, acting as a strategic sounding board and collaborating closely with founders to work through tough problems. She sees her role as helping founders navigate the operational and go-to-market challenges. Juliette notes a renewed interest in robotics from VCs, though she is cautious about some “wild” valuations and funding rounds, preferring bottoms-up market analysis over top-down figures.

Juliette is bullish on the potential of robotics foundation models (RFMs) to drive transformation, emphasizing the need for more multi-modal AI models that integrate vision, action, and communication. She is excited about the possibilities of AI to enhance robotics, but cautions about the risks of AI development burning through funding. Overall, Juliette’s approach focuses on de-risking execution and operational challenges for robotics startups, leveraging her deep technical and business expertise to support founders.

Learn more at: www.scalevp.com/

Jasmeet Singh, Founder, JMOON Ventures

Jasmeet Singh has a diverse background spanning robotics engineering, founding startups, and investing since 2012. As an investor at JMOON Ventures, he focuses on “physical AI” startups – those combining hardware, electronics, and AI in areas like robotics, IoT, and 3D printing.

Jasmeet emphasizes the importance of solving real problems, not just building cool technology. He looks for startups with a strong understanding of the user and business model, noting operational challenges like scaling manufacturing and finding the right business model.
Compared to the more risk-averse Canadian market, Jasmeet sees the US as a better environment for robotics fundraising. He advises founders to target large, underserved problems and focus on customer service and support.

Some of Jasmeet’s investments include Orange Wood Labs, Brisk AI, and Rural Hologram. As he launches JMOON Ventures, he is particularly interested in opportunities in agriculture, construction, medical, and sustainability.

Overall, Jasmeet brings a unique perspective as an investor with deep technical expertise and operational experience in robotics. He is focused on backing founders solving real-world problems with innovative hardware-software solutions.

Learn more at: jmoon.ventures/

Show timeline

  • 8:40 – News of the week
  • 26:38 – Interview with Juliette Chevallier
  • 1:03:00 – Interview with Jasmeet Singh, AKA The Bearded Maker



News of the week

Humanoid video of the week

Kai’s 1X robot didn’t last long after getting rebooted. | Source: @plugfc7 via TikTok

Recent videos featuring internet influencer Kai Cenat and his 1X EVE robot have sparked a significant discussion about the readiness of humanoid robots for domestic use. In one particular incident (seen in the TikTok video above), the robot abruptly powered down and fell over, raising concerns about potential safety hazards and the current limitations of humanoid technology. This event highlights the need for rigorous testing and development before deploying such robots in homes, as opposed to the more controlled industrial environments where they are currently being trialed.

ASTM developing testing standards for mobile manipulators

The ASTM F45 subcommittee is developing a new standard to evaluate the agility of mobile manipulators. This standard aims to provide a standardized testing procedure similar to automotive evaluations, allowing manufacturers to benchmark their solutions and identify areas for improvement. The proposed tests involve tracking a specific path on a table surface and inserting pegs, assessing the robot’s precision and coordination between arm and base movements. This initiative and other ASTM F45 efforts in mobile robot testing underscore the growing importance of standardized evaluation methods for advancing robotics technology.

GEODIS reaches 10M picks with Locus mobile robots

Locus Robotics and GEODIS have reached a major milestone with over 10 million units picked using autonomous mobile robots (AMRs) at a GEODIS distribution center in Pennsylvania. Locus’s AI-powered platform, LocusONE, optimizes worker productivity by directing them to the next pick location, reducing wasted time and boosting efficiency. This partnership highlights the increasing adoption of warehouse automation to meet growing e-commerce demands and improve operational efficiency.


2025 RBR50 Robotics Innovation Awards open for nominations

You can now submit nominations for the 2025 RBR50 innovation awards. They will recognize technology and business innovations in the calendar year 2024, and the awards are open to any company worldwide that produces robotics or automation.

The categories include:

  1. Technologies, products, and services: This category includes primary or applied research focusing on robotics and supporting technologies such as motion control, vision, or machine learning. It also includes new products and business, engineering, or technology services.
  2. Business and management: This category covers initiatives positioning a company as a market leader or an organization as an important thought leader in the robotics ecosystem. Significant mergers and acquisitions are relevant, as are supplier, partner, and integrator relationships.
  3. Applications and markets: The RBR50 will also recognize innovations that improve productivity, quality, and cost-effectiveness, as well as those that automate new tasks.

In addition, the 2025 RBR50 awards will celebrate the following:

  • Startup of the Year
  • Application of the Year
  • Robot of the Year
  • Robots for Good Award

The deadline for submissions is Friday, Dec. 20, 2024.


Podcast sponsored by FlexQube

The show this week is sponsored by FlexQube. Move material of any size, shape, and weight with the FlexQube Navigator AMR, the world’s first multi-purpose, non-load-carrying robot.

The FlexQube Navigator AMR features a standardized coupling interface to connect with an ecosystem of different load carriers depending on the customer’s needs.

The system also features safety-rated identification of the load carrier footprint to ensure safe and efficient scale-up of different use cases in a factory or warehouse.

FlexQube Navigator – robotics that delivers! 

To learn more about FlexQube’s solutions, go to: https://www.flexqube.com


 

Hello Robot's Stretch AI toolkit explores embodied intelligence

Stretch AI is a powerful toolkit designed to help researchers and developers create intelligent behaviors for Hello Robot's Stretch 3 mobile manipulator.


Hello Robot released an open-source collection of tools, tutorials, and reference code called Stretch AI that empowers developers to explore the future of embodied AI on the Stretch 3 mobile manipulator. Stretch 3, released in February 2024, is gaining traction with university labs as both a platform for AI research and a system for real-world deployments.

This comes on the heels of advances in robot utility models as a precursor to embodied AI capabilities. Available policies include ACT, VQ-BeT, and Diffusion Policy.

Stretch AI is a powerful toolkit designed to empower researchers and developers to create intelligent behaviors for the Stretch 3 mobile manipulator. This platform offers a range of capabilities, including:

  • Code for precise grasping and manipulation
  • Advanced mapping and navigation techniques
  • Integration with LLM agents for sophisticated decision-making, plus seamless text-to-speech and speech-to-text functionality
  • Robust visualization and debugging tools to streamline development and testing

Stretch AI integrates open-source AI models, allowing it to accomplish home tasks with natural verbal requests such as “Stretch, pick up the toy, and put it in the bin.” There is a dedicated GitHub repo for Stretch AI.
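
To make the pattern concrete, here is a minimal, hypothetical sketch of the kind of pipeline such a toolkit enables: a spoken request is turned into a plan by a language-model agent and dispatched to robot skills. The module, function, and skill names below are invented for illustration and are not the actual Stretch AI API; consult the GitHub repo for the real interfaces.

```python
# Hypothetical sketch of a verbal-request -> LLM plan -> robot skills pipeline.
# All names here are invented for illustration; this is not the Stretch AI API.
from dataclasses import dataclass

@dataclass
class Step:
    skill: str      # e.g., "navigate_to", "grasp", "place"
    target: str     # e.g., "toy", "bin"

def plan_from_request(request: str) -> list[Step]:
    """Stand-in for an LLM agent that turns a verbal request into skill calls."""
    if "pick up the toy" in request and "bin" in request:
        return [Step("navigate_to", "toy"), Step("grasp", "toy"),
                Step("navigate_to", "bin"), Step("place", "bin")]
    return []

def execute(steps: list[Step], robot) -> None:
    """robot is assumed to expose the named skills (hypothetical interface)."""
    for step in steps:
        getattr(robot, step.skill)(step.target)   # e.g., robot.grasp("toy")

# Example (requires a robot object implementing the hypothetical skills):
# execute(plan_from_request("Stretch, pick up the toy and put it in the bin"), robot)
```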

“With Stretch AI, we wanted to open up access to the latest Embodied AI techniques and make them available to the fast-growing community of Stretch developers,” said Chris Paxton, senior embodied AI lead at Hello Robot. “We’re moving towards a world where robots can perform complex, multi-step tasks in homes. Stretch AI advances the ability to simply develop autonomous systems such as these using AI.”




Taking AI from labs to living rooms

“Thanks to advances in AI, general-purpose home robots like Stretch are developing faster than expected,” said Hello Robot CEO Aaron Edsinger. “However, it is uncommon to see these robots actually working in real homes with real people. With Stretch AI, roboticists can take their work from the lab and begin developing real applications for realistic home settings.”

Stretch AI offers a distinct vision of the future in which AI-powered robots benefit everyone, including older adults, children, and people with disabilities. “Homes are an inclusive place. To truly succeed in homes, robots, and the AI that powers them, should be made for everyone,” said Edsinger.

Hello Robot said its Stretch mobile manipulator is used by developers in 20 countries, from leading universities to innovative companies. With Stretch AI, Hello Robot invites the research community to collaborate on shaping the future of embodied intelligence.

The Stretch 3 is priced at $24,950 and is available on Hello Robot’s website.


Stretch 3 is portable, lightweight, and designed from the ground up to work around people. | Credit: Hello Robot

AMP Robotics raises $91M to accelerate deployment of recycling systems

AMP Robotics will use its latest funding to deploy its AMP ONE systems, which are designed to improve sortation of municipal solid waste.


AMP ONE is designed to capture billions of dollars in value otherwise lost to landfills or incineration annually. Source: AMP Robotics

AMP Robotics Corp. today said it has raised $91 million in corporate equity in a Series D financing. The Louisville, Colo.-based company plans to use its latest funding to accelerate deployment of its AMP ONE systems, which use artificial intelligence and robotics to sort municipal solid waste, or MSW.

“Recycling rates have stagnated in the United States, despite the positive benefits recycling offers local economies and the environment,” said Matanya Horowitz, founder of AMP. “This latest investment enables us to tackle larger projects and deliver real outcomes for waste companies and municipalities – by lowering sortation costs, capturing more material value, diverting organic waste, and extending landfill life – all while helping the industry optimize its strategic assets.”

Founded in 2014, AMP Robotics said its AI platform has identified 150 billion items and guided the sortation of more than 2.5 million tons of recyclables. The company said its technology can help modernize and change the economics of resource recovery. It has three full-scale facilities and more than 400 AI systems deployed across North America, Asia, and Europe.

From sortation to AMP ONE

AMP Robotics said its AI uses deep learning to continuously train itself by processing millions of material images into data. The software uses pattern recognition of colors, textures, shapes, sizes, and logos to identify recyclables and contaminants in real time, enabling new offtake chemistries and capabilities, it added.
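
As a rough illustration of the general idea (not AMP's system, which relies on deep networks trained on millions of labeled images), the toy sketch below classifies a detected item from simple color and shape features using any fitted scikit-learn-style classifier supplied by the caller. All labels, features, and class indices are invented for this example.

```python
# Toy illustration of material classification on a conveyor (not AMP's system).
# Real systems use deep networks; this only shows the feature -> classify pattern.
import numpy as np

def extract_features(rgb_crop):
    """rgb_crop: (H, W, 3) uint8 image of a single detected item."""
    pixels = rgb_crop.reshape(-1, 3).astype(float) / 255.0
    color_mean = pixels.mean(axis=0)          # average R, G, B
    color_std = pixels.std(axis=0)            # rough texture/variation cue
    h, w = rgb_crop.shape[:2]
    aspect_ratio = w / h                      # crude shape cue
    return np.concatenate([color_mean, color_std, [aspect_ratio]])

def classify(rgb_crop, model, labels=("PET", "HDPE", "aluminum", "fiber", "contaminant")):
    """model: any fitted classifier with predict() that returns integer class indices,
    e.g. a scikit-learn RandomForestClassifier trained on the features above."""
    features = extract_features(rgb_crop).reshape(1, -1)
    return labels[int(model.predict(features)[0])]
```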

The company noted that its first products were a series of sorting robots deployed with minimal retrofit into existing recycling facilities. AMP then developed facilities that it claimed involve almost no manual sorting, are reliable, and provide “pervasive data.”

“These facilities make the recovery of commodities safer and more cost-effective than ever and have grown to encompass MSW sorting, an offering out of reach to the industry prior to the advent of AMP’s technology,” it said. “AMP ONE provides a full-scale facility solution to sort various material streams and capture more of the billions of dollars in value otherwise lost to landfills or incinerated annually.”




AMP Robotics marks recent deployments, new CEO

Recycling and Disposal Solutions demonstrated AMP ONE’s ability to cost-effectively sort MSW at its facility in Portsmouth, Va. It has processed 150 tons per day of local waste with more than 90% uptime, said the company.

Last month, AMP Robotics entered into an agreement with Waste Connections Inc. to equip and operate one of Waste Connections’ single-stream recycling facilities in Colorado. 

“AMP provides meaningfully lower-cost, higher-performance systems to recover commodities and increase landfill diversion, and we’re uniquely positioned to reshape the waste and recycling landscape at a critical time,” said Tim Stuart, CEO of AMP. “We’re grateful to our longstanding and newest investors for their support in helping us chart a new path for sustainable materials management and resource efficiency.”

AMP last month augmented its leadership team with the appointment of Stuart, former chief operating officer for Republic Services Inc. Horowitz transitioned from CEO into the role of chief technology officer.

Congruent Ventures leads round

Congruent Ventures led AMP Robotics’ Series D round. Current and new investors participated, including Sequoia Capital, XN, Blue Earth Capital, Liberty Mutual Investments, California State Teachers Retirement System (CalSTRS), Wellington Management, Range Ventures, and Tao Capital Partners.

“AMP’s AI sortation systems enable consumers to recycle both with and without curbside separation and communities to benefit from the recovery of recycled commodities while reducing dependence on landfills,” added Abe Yokell, co-founder and managing partner of Congruent Ventures. “AMP is an example of the real-world impacts of AI; solutions like AMP’s will divert billions of tons of recyclable material from landfills while reducing emissions.”

Congruent Ventures is a leading early-stage venture firm focused on partnering with entrepreneurs to build companies addressing climate and sustainability challenges. The firm has more than $1 billion in assets under management across early-stage climate tech funds and 59 companies in its portfolio.

Project CETI uses AI and robotics to track down sperm whales

Project CETI researchers developed the AVATARS framework to make the most of the small amount of time sperm whales spend on the surface.


Sperm whales spend, on average, 10 minutes of every hour on the surface, presenting challenges for researchers studying them. | Source: Amanda Cotton/Project CETI

In the chilly waters off the New England coast, researchers from the Cetacean Translation Initiative, Project CETI, can spend hours searching and waiting for an elusive sperm whale to surface. During the minutes the whales spend above water, the researchers need to gather as much information as possible before the animals dive back beneath the surface for long periods.

With one of the widest global distributions of any marine mammal species, these whales are difficult to track down, and even more difficult to learn from. Project CETI aims to use robotics and artificial intelligence to decode the vocalizations of sperm whales. It recently released research about how it tracks down sperm whales across the open ocean.

“The ocean and the natural habitat of the whales is this vast place where we don’t have a lot of infrastructure, so it’s hard to build infrastructure that will always be able to observe the whales,” said Stephanie Gil, an assistant professor of Computer Science at the Harvard John A. Paulson School of Engineering and Applied Sciences (SEAS) and an advisor on the project.

The project brings together some of the world’s leading scientists in biology, linguistics, robotics, and more. The founder of Project CETI, David Gruber, estimated that it’s one of the largest multi-disciplinary research projects active today.

“Project CETI was formed in March 2020, and we’re now over 50 scientists across eight different disciplines,” he said. “I think we’re over 15 institutions, which I believe puts us as one of the most interdisciplinary, large-scale science projects that’s ever been conducted. It’s incredibly rewarding to see so many disciplines working together.”

Project CETI shares latest research

The researchers at the nonprofit organization have developed a reinforcement learning framework that uses autonomous drones to find sperm whales and predict where they will surface. The paper, published in Science Robotics, said it’s possible to predict when and where a whale may surface using various sensor data and predictive models of sperm whale dive behavior.

This new study involved various sensing devices, such as Project CETI aerial drones with very high frequency (VHF) signal sensing capability that use signal phase along with the drone’s motion to emulate an “antenna array in the air” for estimating the direction of pings from CETI’s on-whale tags.

“There are two basic advantages of [VHF signals]. One is that they are really low power, so they can operate for a really, really long time in the field, like months or even years. So, once those small beacons are deployed on the tag, you don’t have to really replace the batteries,” said Ninad Jadhav, a co-author on the paper and a robotics and engineering Ph.D. student at Harvard University.

“The second thing is these signals that these tags transmit, the VHF, are very high-frequency signals,” he added. “They can be detected at really long ranges.”

“That’s a really huge advantage because we never know when the whales will surface or where they will surface, but if they have been tagged before, then you can sense, for example, simple information such as the direction of the signal,” Jadhav told The Robot Report. “You can deploy an algorithm on the robot to detect that, and that gives us an advantage of finding where the whales are on the surface.”
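
The following is a toy numerical sketch of that synthetic-aperture idea: given carrier-phase measurements of a narrowband VHF ping taken at several known drone positions, a grid search finds the bearing that best explains the phase progression. It is illustrative only, not Project CETI's implementation; the carrier frequency, 2D geometry, and far-field assumptions are chosen for simplicity.

```python
# Illustrative sketch (not Project CETI's code): estimate the bearing of a VHF
# tag ping from carrier phases measured at successive drone positions, treating
# the moving antenna as a synthetic "antenna array in the air."
import numpy as np

C = 3e8                       # speed of light, m/s
FREQ = 150e6                  # assumed VHF carrier frequency, Hz (placeholder)
LAM = C / FREQ                # wavelength, m

def bearing_from_phases(drone_xy, phases, candidates=np.deg2rad(np.arange(0, 360))):
    """Grid-search the tag bearing that best explains the measured carrier phases.

    drone_xy: (N, 2) array of drone positions when each phase sample was taken.
    phases:   (N,) measured carrier phases in radians.
    Returns the candidate bearing (radians) with the highest coherence score.
    """
    baselines = drone_xy - drone_xy[0]               # displacements from first sample
    best_theta, best_score = None, -np.inf
    for theta in candidates:
        u = np.array([np.cos(theta), np.sin(theta)])  # unit vector toward the tag
        expected = 2 * np.pi / LAM * baselines @ u    # expected phase progression
        # Coherently sum the residual phases; the true bearing maximizes |sum|.
        score = np.abs(np.sum(np.exp(1j * (phases - expected))))
        if score > best_score:
            best_theta, best_score = theta, score
    return best_theta
```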

Sperm whales present unique challenges for data collection


From left to right: Stephanie Gil, Sushmita Bhattacharya, and Ninad Jadhav. | Source: Stu Rosner

“Sperm whales are only on the surface for about 10 minutes every hour,” said Gil. “Other than that, they’re diving pretty deep in the ocean, so it’s hard to access information about what the whales are actually doing. That makes them somewhat elusive for us and for science.”

“Even we humans have certain patterns day to day. But if you’re actually out observing whales on a particular day, their behavior is not going to exactly align with the models, no matter how much data you’re using to make those models right. So it’s very difficult to really predict with precision when they might be coming up,” she continued.

“You can imagine, if [the scientists are] out on the water for days and days, only having a few encounters with the whales, we’re not being that efficient. So this is to increase our efficiency,” Gruber told The Robot Report.

Once the Project CETI researchers can track down the whales, they must gather as much information as possible during the short windows of time sperm whales spend on the surface.

“Underwater data collection is quite challenging,” said Sushmita Bhattacharya, a co-author on the paper and a computer science and robotics Ph.D. student at Harvard University. “So, what is easier than underwater data collection is to have data collected when they’re at the surface. We can leverage drones or shallow hydrophones and collect as much data as possible.”




Developing the AVATARS framework

At the center of the research is the Autonomous Vehicles for Whale Tracking And Rendezvous by Remote Sensing, or AVATARS framework. AVATARS is the first co-development of VHF sensing and reinforcement learning decision-making for maximizing the rendezvous of robots and whales at sea.

“We tried to build up a model which would kind of mimic [sperm whale] behavior,” Bhattacharya said of AVATARS. “We do this based on the current information that we gather from the sparse data set.”

Being able to predict when and where the whales will surface allowed the researchers to design algorithms for the most efficient route for a drone to rendezvous with—or encounter—a whale at the surface. Designing these algorithms was challenging on many levels, the researchers said.

“Probably the hardest thing is the fact that it is such an uncertain problem. We don’t have certainty at all in [the whales’] positions when they’re underwater, because you can’t track them with GPS when they’re underwater,” Gil said. “You have to think of other ways of trying to track them, for example, by using their acoustic signals and an angle of arrival to their acoustic signals that give you a rough idea of where they are.”

“Ultimately, these algorithms are routing algorithms. So you’re trying to route a team of robots to be at a particular location in the environment, in the world, at a certain given time when it’s necessary to be there,” she told The Robot Report. “So this is analogous to something like rideshare.”
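
In the spirit of that rideshare analogy, here is a heavily simplified, hypothetical sketch of the decision the routing layer has to make: send a drone to the predicted surfacing location it can actually reach in time, weighted by how likely that prediction is. AVATARS itself uses reinforcement learning over far richer state; every number below is invented for illustration.

```python
# Toy illustration (not the AVATARS algorithm): pick the waypoint that maximizes
# the probability of a drone being on station when a whale surfaces, given a
# probabilistic forecast of surfacing locations and times.
import numpy as np

DRONE_SPEED = 15.0  # m/s, assumed cruise speed

def best_waypoint(drone_pos, forecasts):
    """forecasts: list of (xy, prob, t_surface_s) tuples from a dive-behavior model.
    Returns the forecast location with the highest probability the drone can reach
    before the predicted surfacing time."""
    best, best_p = None, 0.0
    for xy, prob, t_surface in forecasts:
        travel_time = np.linalg.norm(np.asarray(xy) - np.asarray(drone_pos)) / DRONE_SPEED
        if travel_time <= t_surface and prob > best_p:   # can arrive before surfacing
            best, best_p = xy, prob
        # A fuller planner would also trade off multiple whales, battery, and wind.
    return best

# Example: whale most likely to surface 1.2 km east in about 6 minutes
print(best_waypoint((0.0, 0.0), [((1200.0, 0.0), 0.6, 360.0), ((-2000.0, 500.0), 0.4, 300.0)]))
```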

Before bringing the algorithms into the real world with real whales, the team tested them in a controlled environment with devices the team put together to mimic whales.

“We mimicked the whale using an engineered whale,” recalled Bhattacharya. “So basically we used a speed boat, and it had a loud engine. We used that engine noise to mimic the whale vocalization, and we had it move to mimic whale motion. And then we used that as our ground test.”

Project CETI tests AVATARS in the real world


A customized off-the-shelf drone flying to deploy a whale tag developed by Project CETI researchers. | Source: Project CETI

“Every day was a challenge when we were out on the boat, because this was for me, and my co-author Sushmita, the first time we were deploying real autonomous robots from a boat in the middle of the sea trying to collect some information,” Jadhav said.

“One of the major challenges of working in this environment was the noise in the sensor,” he continued. “As opposed to running experiments in the lab environment, which is more controlled, there are fewer sources of noise that impact your experiments or your sensor data.”

“The other key challenge was deploying the drone itself from the boat,” noted Jadhav. “I remember one instance where this was probably the first or second day of the second expedition that we went on last November, and I had the drone ready. It had the payload. It was waterproof.”

“I had already run experiments here in Boston locally, where I had an estimate of how long the drone would fly with the payload. And then we were out on the boat running some initial tests, and the drone took off,” he said. “It was fine, it was doing its thing, and within a minute of it collecting data, there was a sudden gust of wind. The drone just lost control and crashed in the water.”

The team also had to try to predict and react to whale behavior when performing field tests.

“Our algorithm was designed to handle sensor data from a single whale, but what we ended up seeing is that there were four whales together, who were socializing,” Jadhav said. “They were diving and then surfacing at the same time. So, this was tricky, because then it becomes really hard for us on the algorithm side to understand which whale is sending which acoustic signal and which one we are tracking.”

Team tries to gather data without disturbing wildlife

While Project CETI works closely with sperm whales and other sea life that might be around when the whales surface, it aims to leave the whales undisturbed during data collection.

“The main concern that we care about is that even if we fail, we should not harm the whales,” Bhattacharya said. “So we have to be very careful about respecting the boundaries of those animals. That’s why we are looking at a rendezvous radius. Our goal is to go near the whale and not land on it.”

“Being minimally invasive and invisible is a key part of Project CETI,” said Gruber. “[We’re interested in] how to collect this information without interacting directly with the whale.”

This is why the team works mostly with drones that won’t disturb sea life and with specially developed tags that latch onto the whales and collect data. The CETI team eventually collects these tags, and the valuable data they contain, after they fall off the whales.

“A lot of times, people might think of robotics and autonomy as a scary thing, but this is a really important project to showcase that robots can be used to extend the reach of humans and help us understand our world better,” Gil told The Robot Report.

Project CETI aims to decode whale communications

This latest research is just one step in Project CETI’s overarching goal to decode sperm whale vocalizations. In the short term, the organization plans to ramp up data collection, which will be crucial for the project’s long-term goals.

“Once we have all the algorithms worked out, a future outlook is one where we might have, for example, drone ports in the sea that can deploy robots with sensors around the clock to observe whales when they’re available for observation,” Gil said.

“We envision a team of drones that will essentially meet or visit the whales at the right place, at the right time,” Jadhav said. “So whenever the whales surface, you essentially have a kind of autonomous drone, or autonomous robot, very close to the whale to collect information such as visual information or even acoustic if the drone is equipped with that.”

Outside of Project CETI, organizations could use AVATARS to further protect sperm whales in their natural environments. For example, this information could be used to reroute ships away from sperm whale hot spots, reducing the odds of a ship colliding with a pod of sperm whales.

“The idea is that if we understand more about the whales, more about the whale communities, more about their social structures, then this will also enable and motivate conservation projects and understanding of marine life and how it needs to be protected,” Gil said.

In addition, the researchers said they could apply these methods to other sea mammals that vocalize.

“Here at Project CETI, we’re concerned about sperm whales, but I think this can be generalized to other marine mammals, because a lot of marine mammals vocalize, including humpback whales, other types of whales, and dolphins,” Bhattacharya said.

AWS offers accelerated robotics simulation with NVIDIA

AWS and NVIDIA said that Isaac Sim on Amazon Web Services can significantly accelerate and scale robot simulation and AI training.


AWS and Isaac Sim can help accelerate robotics development, says NVIDIA.

NVIDIA Corp. today announced at AWS re:Invent enhanced tools for robotics developers, as well as the availability of NVIDIA DGX Cloud on Amazon Web Services and offerings for artificial intelligence and quantum computing.

The company said that NVIDIA Isaac Sim is now available on NVIDIA L40S graphics processing units (GPUs) in Amazon Elastic Compute Cloud (EC2) G6e instances. It said this could double the scale of robotics simulation and accelerate AI model training. Isaac Sim is a reference application built on NVIDIA Omniverse for developers to simulate and test AI-driven robots in physically based virtual environments.

With NVIDIA OSMO, a cloud-native orchestration platform, developers can easily manage their complex robotics workflows across their AWS computing infrastructure, claimed the company.

“This combination of NVIDIA-accelerated hardware and software — available on the cloud — allows teams of any size to scale their physical AI workflows,” wrote Akhil Docca, senior product marketing manager for Omniverse at NVIDIA.
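
For developers starting from scratch, provisioning one of these GPU instances is a standard EC2 workflow. The sketch below uses boto3 to launch a G6e (L40S) instance; the AMI ID, key pair, and region are placeholders, and in practice you would use the Isaac Sim listing from AWS Marketplace or your own prepared image, plus appropriate storage and networking.

```python
# Hypothetical sketch: launch an EC2 G6e (NVIDIA L40S GPU) instance with boto3
# to host a robotics simulation workload. The AMI ID below is a placeholder.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")   # region is an example

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder, not a real Isaac Sim listing
    InstanceType="g6e.2xlarge",        # G6e family = NVIDIA L40S GPU instances
    MinCount=1,
    MaxCount=1,
    KeyName="my-keypair",              # assumes an existing EC2 key pair
    TagSpecifications=[{
        "ResourceType": "instance",
        "Tags": [{"Key": "purpose", "Value": "robot-sim"}],
    }],
)
print(response["Instances"][0]["InstanceId"])
```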




What is ‘physical AI?’

According to NVIDIA, “physical AI” describes AI models that can understand and interact with the physical world. The company said it “embodies the next wave of autonomous machines,” such as self-driving cars, industrial manipulators, mobile robots, humanoids, and even robot-run infrastructure like factories and warehouses.

With physical AI, developers are embracing a “three-computer solution” for training, simulation, and inference to make breakthroughs, NVIDIA said. Yet physical AI for robotics systems requires robust training datasets to achieve precision inference in deployment. Developing such datasets and testing them in real situations can be impractical and costly.

Simulation offers an answer, as it can accelerate the training, testing and deployment of AI-driven robots, the company asserted.

L40S GPUs in the cloud offer scalable simulation, training

Developers can use simulation to verify, validate, and optimize robot designs as well as the systems and their algorithms before deployment, said NVIDIA. It added that simulation can optimize facility and system designs before construction or remodeling starts for maximum efficiencies, reducing costly manufacturing change orders.

Amazon EC2 G6e instances accelerated by NVIDIA L40S GPUs can double performance over the prior architecture, while allowing the flexibility to scale as scene and simulation complexity grows, NVIDIA said. Roboticists can use these instances to train many computer vision models that power AI-driven robots.

This means the same instances can be extended for various tasks, from data generation and simulation to model training. NVIDIA added that OSMO allows teams to orchestrate and scale complex robotics development workflows across distributed computing resources, whether on premises or in the AWS cloud.

NVIDIA said Isaac Sim can foster collaboration and critical workflows, such as generating synthetic data for perception model training.

A reference workflow combines NVIDIA Omniverse Replicator, a framework for building custom synthetic data generation (SDG) pipelines and a core extension of Isaac Sim, with NVIDIA NIM microservices. With it, developers can build generative AI-enabled SDG pipelines, it said.

These include the USD Code NIM microservice for generating Python USD code and answering OpenUSD queries, plus the USD Search NIM microservice for exploring OpenUSD assets using natural language or image inputs.

The Edify 360 HDRi NIM microservice can generate 360-degree environment maps, while the Edify 3D NIM microservice can create ready-to-edit 3D assets from text or image prompts. Generative AI can thus ease the synthetic data generation process by reducing many tedious and manual steps, from asset creation to image augmentation, said NVIDIA.
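
For readers unfamiliar with SDG pipelines, the minimal sketch below shows the underlying pattern: randomize scene parameters, render a frame, and record the ground-truth labels that come for free because the scene is fully known. It is generic Python illustrating the concept, not Omniverse Replicator or NIM code.

```python
# Generic illustration of the synthetic data generation (SDG) pattern:
# randomize scene parameters, render (stubbed out here), and save labels.
# This is NOT Omniverse Replicator code, just the shape of the workflow.
import json
import random

def randomize_scene():
    """Sample a randomized scene description (poses, lighting, textures)."""
    return {
        "object_pose": [random.uniform(-0.5, 0.5), random.uniform(-0.5, 0.5), 0.0],
        "object_yaw_deg": random.uniform(0, 360),
        "light_intensity": random.uniform(200, 1200),
        "texture_id": random.randint(0, 9),
    }

def render(scene, index):
    """Placeholder for a renderer call (e.g., a simulator producing an RGB image)."""
    return f"frame_{index:05d}.png"

samples = []
for i in range(1000):
    scene = randomize_scene()
    image_path = render(scene, i)
    # Ground-truth labels come for free because we control the scene.
    samples.append({"image": image_path, "label": scene})

with open("synthetic_dataset.json", "w") as f:
    json.dump(samples, f, indent=2)
```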

  • Rendered.ai’s synthetic data engineering platform is integrated with Omniverse Replicator. It enables companies to generate synthetic data for computer vision models used in industries from security and intelligence to manufacturing and agriculture.
  • SoftServe Inc., an IT consulting and digital services provider, uses Isaac Sim to generate synthetic data and validate robots used in vertical farming with Pfeifer & Langen, a leading European food producer.
  • Tata Consultancy Services is building custom synthetic data generation pipelines to power its Mobility AI suite to address automotive and autonomous use cases by simulating real-world scenarios. Its applications include defect detection, end-of-line quality inspection, and hazard avoidance.

NVIDIA, AWS help robots learn in simulation

While Isaac Sim enables developers to test and validate robots in physically accurate simulation, Isaac Lab, an open-source robot learning framework built on Isaac Sim, provides a virtual playground for building robot policies that can run on AWS Batch. Because these simulations are repeatable, developers can troubleshoot and reduce the number of cycles required for validation and testing, said NVIDIA.

The company cited robotics startups that are already using Isaac Sim on AWS: 

  • Field AI is building robot foundation models to enable robots to autonomously manage a wide range of industrial processes. It uses Isaac Sim and Isaac Lab to evaluate the performance of these models in complex, unstructured environments in construction, manufacturing, oil and gas, mining, and more.
  • Vention, which offers a full-stack cloud-based automation platform, is creating pretrained skills to ease development of robotic tasks, noted NVIDIA. It is using Isaac Sim to develop and test new capabilities for robot cells used by small to midsize manufacturers.
  • Cobot offers Proxie, its AI-powered collaborative mobile manipulator. It uses Isaac Sim to enable the robot to adapt to dynamic environments, work alongside people, and streamline logistics in warehouses, hospitals, airports, and more.
  • Standard Bots is simulating and validating the performance of its R01 robot used in manufacturing and machining setup.
  • Swiss-Mile is using Isaac Sim and Isaac Lab for robot learning so that its wheeled quadruped robots can perform tasks autonomously with new levels of efficiency in factories and warehouses.
  • Cohesive Robotics has integrated Isaac Sim into its software framework called Argus OS for developing and deploying robotic workcells used in high-mix manufacturing environments.
  • Aescape’s robots are able to provide precision-tailored massages by accurately modeling and tuning the onboard sensors in Isaac Sim.

NVIDIA made other announcements in addition to the availability of Isaac Sim 4.2 on Amazon EC2 G6e Instances powered by NVIDIA L40S GPUs on AWS Marketplace.

It said that NVIDIA DGX Cloud can run on AWS for training AI models; that AWS liquid cooling is available for data centers using its Blackwell platform; and that NVIDIA BioNeMo NIM microservices and AI Blueprints, developed to advance drug discovery, are now integrated into AWS HealthOmics.

The company also said its latest AI Blueprints are available on AWS for video search and cybersecurity, the integration of NVIDIA CUDA-Q with Amazon Braket for quantum computing development, and RAPIDS Quick Start Notebooks on Amazon EMR.

Top 10 robotics developments of November 2024

In November 2024, stories about the future of robotics, big robot milestones, and new product unveilings grabbed our readers' attention.

The start of the holiday season hasn’t slowed down the robotics industry. In November 2024, stories about the future of robotics, big robot milestones, and new product unveilings grabbed our readers’ attention.

Here are the top 10 most popular stories on The Robot Report in the past month. Subscribe to The Robot Report Newsletter and listen to The Robot Report Podcast to stay up to date on the robotics developments you need to know about.


10. Europe has a key role to play in the development of robots, humanoids

While headlines often spotlight U.S. and Asian companies in the humanoid robotics race, startups in the tech hubs of Europe are making strides in developing human-like robots. From Norway to Switzerland, innovative European firms are pushing the boundaries of robotics technology, creating machines that can sense, feel, and interact with their environments in increasingly human-like ways. Read more.


9. Moxi reaches milestone of 100,000 autonomous elevator rides in hospitals

As development continues on humanoid robots, one mobile robot is already at work in hospitals. Diligent Robotics announced that its Moxi robot has completed 110,000 autonomous elevator rides at health systems across the U.S. The mobile manipulator has a single arm for opening doors and pushing buttons to operate elevators. Read more.


8. AeroVironment acquiring BlueHalo for $4.1B to boost defense tech

Defense contractor AeroVironment has agreed to acquire BlueHalo in an all-stock transaction worth approximately $4.1 billion. BlueHalo is best known for its drone swarm and counter-drone technology. The acquisition, which has been approved by both companies’ boards of directors, is expected to close in the first half of 2025. Read more.


7. Kassow Robots’ new cobots designed for mobile manipulation

Kassow Robots in November 2024 introduced a new line of compact collaborative robots designed to integrate with mobile robots. The new Edge Edition cobots are smaller robot arms designed for mobile manipulation applications. They feature a direct DC connection from battery power, enabling them to operate while mounted to a mobile robot. Read more.


6. Collaborative Robotics unveils Proxie mobile manipulator

Collaborative Robotics Inc. unveiled its Proxie mobile manipulator publicly for the first time. The startup has been secretive about the design of the robot since Brad Porter founded the company in 2022. Porter has hinted at the design of the robot by alluding to the importance of a mobile manipulator for applications within the warehouse, with a kinematic that could be better suited for warehouse workflows than a humanoid. Read more.


5. Physical Intelligence raises $400M for foundation models for robotics

Foundation models promise to give robots the ability to generalize actions from fewer examples than traditional artificial intelligence approaches. Physical Intelligence said it has raised $400 million to continue its development of artificial intelligence for a range of robots. Read more.


4. Schaeffler plans global use of Agility Robotics’ Digit humanoid

Schaeffler AG, a global leader in motion technology, is making a minority investment into Agility Robotics and buying Digit humanoid robots for use across its global plant network. The companies did not disclose the size of the November 2024 investment, the number of humanoids being purchased, or what they will be used for. Read more.


3. Pickle Robot gets orders for over 30 unloading systems plus $50M in funding

Robotic truck unloading fits the classic definition of dull, dirty, or dangerous jobs worth automating. Pickle Robot has raised $50 million in Series B funding and said that six customers placed orders during the third quarter for more than 30 robots to deploy in the first half of 2025. The new orders include pilot conversions, existing customer expansions, and new customer adoption. Read more.


2. Chicago’s South Suburbs see the future of manufacturing as American and robotic

For decades, the Chicagoland area has played a pivotal role in American manufacturing capability. Unfortunately, the once-strong bastion of manufacturing and fabrication has lost much of its fervor following years of economic stagnation, outmigration, and a declining tax base. However, as the global marketplace continues to evolve, U.S. manufacturers must contend with an aging ownership base, greater competition, and a severe labor shortage. Read more.


1. Red Cat wins U.S. Army next-gen drone contract over Skydio

Red Cat Holdings Inc. announced that it won the U.S. Army’s Short Range Reconnaissance, or SRR, program-of-record contract. The company replaced Skydio on this contract. The U.S. Army set an initial acquisition target of 5,880 systems over a five-year period. Read more.

Oxipital AI releases VX2 Vision System for inspection and picking

Oxipital AI says its advanced vision system is more compact, delivers greater precision, and is more affordable than its predecessor.


The VX2 Vision System uses AI for food-grade inspection and picking, says Oxipital AI.

Oxipital AI this month launched its VX2 Vision System, which uses artificial intelligence for inspection and high-speed picking applications across food-grade and industrial sectors. Built on the company’s proprietary Visual AI platform, the VX2 comes in a more compact package at a more accessible price than its predecessor.

“At Oxipital AI, we believe that listening to our customers and learning from real-world applications is the key to driving innovation,” said Austin Harvey, vice president of product at Oxipital. “The VX2 is the result of that philosophy in action. It’s smaller, more powerful, and more versatile, enabling our customers to build more resilient manufacturing processes.”

Formerly Soft Robotics, Oxipital is developing machine vision for product inspection and robotic process automation in critical industries such as food processing, agriculture, and consumer goods production.

The Bedford, Mass.-based company’s stated mission is “to deliver actionable insights through deep object understanding to customers as they embrace Industry 5.0 and unlock previously unachievable levels of resiliency, efficiency, and sustainability in their manufacturing operations.”





VX2 Vision System includes several enhancements

Oxipital AI said the VX2 Vision System represents a significant improvement over its first-generation vision platform. The company said it incorporated customer feedback and extensive field learning to meet the evolving needs of the industry.

The VX2 has enhanced capabilities for inspection, high-speed picking, and high-speed picking with inspection, said Oxipital. It asserted that the system ensures optimal efficiency and precision in a wide variety of environments and listed the following benefits:

  • Compact and powerful: The VX2 packs more processing power into a smaller, more efficient design, providing greater flexibility for installations in tight spaces or complex environments, said Oxipital.
  • Versatile application: Designed for food-grade and industrial use, the VX2 excels in inspection tasks, high-speed handling, and combining both, ensuring accuracy and speed in demanding workflows.
  • Enhanced Visual AI platform: Oxipital said its platform delivers faster, more accurate decision-making capabilities, ensuring high-performance, real-time operations.
  • Better price point: Despite significant improvements in power and versatility, the VX2 is available at a more competitive price, said the company. This makes it an attractive option for businesses seeking to upgrade their capabilities without incurring significant costs, it added.

The VX2 Vision System continues Oxipital’s response to user feedback. Source: Oxipital AI

Oxipital AI applies vision to industry needs

With the VX2 launch at PACK EXPO this month, Oxipital said the technology demonstrates its commitment to innovations that address the challenges that industry is currently facing.

“Oxipital AI continues to push the boundaries of what is possible with vision systems in automated environments,” it said. Soft Robotics previously made compliant grippers before pivoting to vision AI.

Oxipital has partnered with Schmalz and Velec, and it was nominated as a PACK EXPO Food and Beverage Technology Excellence Award finalist.

Nuro navigates to fully autonomous driving

Andrew Clare, CTO of Nuro, discusses the current state of self-driving vehicles and software and the road to Level 5 autonomy.

In Episode 174 of The Robot Report Podcast, we feature an interview with Andrew Clare, chief technology officer of Nuro Inc. It’s a short workweek in the U.S. with the Thanksgiving holiday, so we skipped the news this week and went straight into the interview with Clare.

He discusses the company‘s evolution in the autonomous vehicle space, focusing on its Nuro Driver technology. Clare elaborates on Nuro’s expansion of its business model to include partnerships with automotive OEMs and the potential market for AI-based driving.

Clare also highlights the challenges of urban versus highway driving, the importance of safety culture, and the technology stack required for autonomous vehicles. We also touch on the differences between SAE Level 4 and Level 5 autonomy, as well as the future direction of Nuro in integrating hardware and software.

Show timeline

  • 8:12 – Interview with Andrew Clare



News of the week

We’ll discuss the latest news after the Thanksgiving holiday.


2025 RBR50 Robotics Innovation Awards open for nominations

You can now submit nominations for the 2025 RBR50 innovation awards. They will recognize technology and business innovations in the calendar year 2024, and the awards are open to any company worldwide that produces robotics or automation.

The categories include:

  1. Technologies, products, and services: This category includes primary or applied research focusing on robotics and supporting technologies such as motion control, vision, or machine learning. It also includes new products and business, engineering, or technology services.
  2. Business and management: This category covers initiatives positioning a company as a market leader or an organization as an important thought leader in the robotics ecosystem. Significant mergers and acquisitions are relevant, as are supplier, partner, and integrator relationships.
  3. Applications and markets: The RBR50 will also recognize innovations that improve productivity, quality, and cost-effectiveness, as well as those that automate new tasks.

In addition, the 2025 RBR50 awards will celebrate the following:

  • Startup of the Year
  • Application of the Year
  • Robot of the Year
  • Robots for Good Award

The deadline for submissions is Friday, Dec. 20, 2024.


Podcast sponsored by RGo Robotics

The show this week is sponsored by RGo Robotics Inc.

Is your autonomous mobile robot (AMR) struggling in dynamic environments? Is your business stuck because it takes months to commission a new site?

RGo Robotics’ Perception Engine is revolutionizing the AMR business through advanced Vision AI perception technology. Unlike traditional solutions, the company’s software enables AMRs to adapt to changing environments and navigate complex spaces with unprecedented accuracy, and the commissioning process is shorter and simpler.

Leading AMR companies are enhancing their fleets with RGo’s AI-powered perception, enabling their teams to accelerate use of advanced AI capabilities like foundation models and digital twins.

Don’t let outdated navigation hold your business back.

To learn more about RGo’s solutions, go to: https://www.rgorobotics.ai/


 

Learn about digitalization in the warehouse in new webinar

Digitalization of the warehouse involves several emerging technologies; attendees of this free webinar can learn from industry experts.


Digitalization is bringing emerging technologies into the warehouse. Source: Dexory

Designing and deploying a digital warehouse can be a challenge, with numerous technology options to add to your operations. From robotics and automation to the latest data analytics and artificial intelligence, how can you take advantage of digitalization?

At 2:00 p.m. EST on Wednesday, Dec. 4, expert panelists will discuss how emerging technologies are changing how engineers design warehouse systems and how businesses can gain insights and efficiencies with them. Sensors, digital twins, wearables, and virtual assistants are some of the tools that are part of this digital transformation.

In this free webinar, viewers can learn about:

  • Ways to improve labor productivity with workforce management
  • The orchestration of people and autonomous mobile robots (AMRs) for order picking and fulfillment
  • Where augmented and virtual reality (AR/VR) fit in the warehouse
  • How AI will change how operators use data in a positive feedback cycle
  • How to scale digital transformation across facilities and the supply chain

Register now to attend this webinar on digitalization, and have your questions answered live. Registrants will be able to view it on demand after the broadcast date.

Digitalization speakers to share insights


Robert C. Kennedy is principal at RC Kennedy Consulting. For over four decades, he has planned, developed, and implemented industry-leading supply chain execution systems around the globe. Kennedy and his staff have led more than 200 large-scale implementation projects of supply chain execution software for leading customers in a variety of industries, including pharmaceutical, electronics, third-party logistics (3PL), and food and beverage.

A leading voice in the industry, Kennedy is regularly interviewed by industry media, has published articles, and has presented at numerous trade shows and seminars.

RC Kennedy Consulting helps companies improve operational efficiencies through process design and systems. It also helps them develop strategies for growth.


Ken Ramoutar is chief marketing officer at Lucas Systems, which helps companies transform their distribution center by dramatically increasing worker productivity, operational agility, and customer and worker satisfaction using voice and AI optimization technologies.

In his 25 years of customer-centric roles in supply chain software and consulting, Ramoutar has navigated companies through uncertainty and volatility as a thought leader and change agent.

Prior to Lucas, Ken was senior vice president and global head of customer experience at Avanade, a $3 billion Accenture and Microsoft-owned company, and he has held leadership roles at IBM, Sterling Commerce, and SAP/Ariba.


Michael Taylor is the chief product officer and co-founder of Duality AI. He has a 20-year career in mobile robotics, with 15 years dedicated to building autonomous field robots at Caterpillar.

While there, Mike led the team developing the autonomy system for Caterpillar’s autonomous dozer, and he helped launch the Autonomous Mining Truck program. His roles included architecting behaviors and planning systems, as well as building a collection of simulation technologies to accelerate deployment to customer sites.

Taylor was also part of the Carnegie Mellon team that won DARPA’s Urban Challenge, where he led both the Controls Team and the Field Calibration Team. Taylor holds dozens of patents in fields ranging from robotics to simulation technologies.

At Duality AI, Taylor leads the company’s Product and Solutions Engineering team. He is responsible for steering Duality’s product strategy, developing technologies to address customer needs, and helping ensure that customers maximize the value they extract from Falcon. This includes projects ranging from a simulation solution supporting a drone-based AI perception system, to generating synthetic data for high-volume manufacturing quality assurance, to characterizing and modeling uncrewed ground vehicles (UGVs) navigating novel environments.


Eugene Demaitre, moderator, is the editorial director for robotics at WTWH Media, which produces Automated Warehouse, The Robot Report, the Robotics Summit & Expo, and RoboBusiness. Prior to working for WTWH Media, he was an editor at BNA (now part of Bloomberg), Computerworld, TechTarget, Robotics Business Review, and Robotics 24/7.

Demaitre has participated in conferences worldwide, as well as spoken on numerous webcasts and podcasts. He is always interested in learning more about robotics. He has a master’s from the George Washington University and lives in the Boston area.

This webinar is sponsored by Balluff and Dexory.


Imagry moves to make buses autonomous without mapping
https://www.therobotreport.com/imagry-moves-to-make-buses-autonomous-without-mapping/
Mon, 25 Nov 2024

Imagry has developed hardware-agnostic systems to provide Level 4 autonomy to buses with time to market in mind.


Imagry says its software enables buses to autonomously handle complex situations such as roundabouts. Source: Imagry

Autonomous vehicles often rely heavily on prior information about their routes, but new technology promises to improve real-time situational awareness for vehicles including buses. Imagry said its “HD-mapless driving” software stack enables vehicles to react to dynamic contexts and situations more like human drivers.

The company also said its AI Vision 360 eliminates the need for external sensor infrastructure. It claimed that its bio-inspired neural network and hardware-agnostic systems allow for SAE Level 3/4 operations without spending time on mapping.

“We’ve been focusing on two sectors,” said Eran Ofir, CEO of Imagry. “We’ve been selling our perception and motion-planning stack to Tier 1 suppliers and automotive OEMs for autonomous vehicles. We signed a 10-year contract with Continental and are jointly developing a software-defined vehicle platform.”

“And we’ve started working with transportation operators on providing autonomous buses,” he told The Robot Report. “For example, in Turkey, France, Spain, and soon Japan, we’re retrofitting electric buses to be autonomous.”




Imagry trains in real time with supervision

Imagry was established in 2015 with a focus on computer vision for retail. In 2018, it began focusing entirely on autonomous driving. The company now has about 120 employees in San Jose, Calif., and Haifa, Israel.

Imagry said its technology is similar to that of Tesla in relying on 3D vision for perception and motion planning rather than rule-based coding or maps.

“Most players in the industry use HD maps with 5 cm [1.9 in.] resolution, telling the vehicle where lights, signs, and lane markers are,” said Ofir. “Our system teaches itself with supervised learning. It maps in real time while driving. Like a human driver, it gets the route but doesn’t know what it will find.”

How does Imagry deal with the need for massive data sets to train for navigation and obstacle detection and avoidance?

“We wrote a proprietary tool for annotation to train faster, better, and cheaper,” Ofir replied. “The data is collected but doesn’t live in the cloud. The human supervisor tells the vehicle where it was wrong, like a child. We deliver over-the-air updates to customers.”
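
To make that workflow concrete, here is a minimal, hypothetical sketch of such a correction loop: frames where the human supervisor disagrees with the model are kept as new training examples, and a retrained model is versioned for an over-the-air release. The data structures and function names are illustrative placeholders, not Imagry's tooling.

```python
# Hypothetical sketch of a human-in-the-loop correction cycle for a
# vision-based driving model. Not Imagry's implementation; all names
# and structures are illustrative placeholders.

from dataclasses import dataclass

@dataclass
class Frame:
    frame_id: int
    model_label: str       # what the onboard model predicted (e.g., "clear lane")
    supervisor_label: str  # what the human annotator says is correct

def collect_corrections(drive_log):
    """Keep only the frames where the supervisor disagreed with the model."""
    return [f for f in drive_log if f.model_label != f.supervisor_label]

def correction_cycle(drive_log, training_set, model_version):
    corrections = collect_corrections(drive_log)
    training_set.extend(corrections)     # corrected frames enlarge the dataset
    # ... retrain or fine-tune the perception model on training_set here ...
    new_version = model_version + 1      # bump the version for the OTA release
    return training_set, new_version

if __name__ == "__main__":
    log = [
        Frame(1, "clear lane", "clear lane"),
        Frame(2, "clear lane", "construction zone"),  # model missed a cone pattern
        Frame(3, "stop sign", "stop sign"),
    ]
    dataset, version = correction_cycle(log, training_set=[], model_version=7)
    print(f"{len(dataset)} corrected frames added; shipping model v{version} over the air")
```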

“The world doesn’t belong to HD maps — it’s a matter of trusting AI-based software for perception and motion planning,” he said.

Ofir cited an example of a vehicle in Arizona on a random route with no communications to centralized computing. Its onboard sensors and compute recognized construction zones, skateboarders, a bike lane, and stop signs.

“The capability to drive out of the box in new places is unique to Imagry,” asserted Ofir. “We can handle righthand and lefthand driving, such as in Tokyo, where we’ve been driving for a year now.”

How does the bus know when to stop for passengers?

It could stop at every bus stop, upon request via a button at the stop (for the elderly, who may not use phone apps), or be summoned by an app that also handles payment, responded Ofir. Imagry’s system also supports “kneeling” for people with disabilities.

Why buses are a better focus for autonomy

Imagry has decided to focus on urban use cases rather than highways. It is simpler to bring buses to Level 4 autonomy, said Ofir.

“Autonomous buses are better than ride hailing; they’re simpler than passenger vehicles,” said Ofir. “They drive in specific routes and at a speed of only 50 kph [31 mph] versus 80 kph [50 mph]. It’s a simpler use case, with economies of scale.”

“The time to revenue is much faster — the design cycle is four years, while integrating with a bus takes two to three months,” he explained. “Once we hand it over to the transport operator, we can get to L4 in 18 months, and then they can buy and deploy 40 more buses.”

In addition, the regulations for autonomous buses are clearer, with 22 countries running pilots, he noted.

“We already have projects with a large medical center and on a public road in Israel,” Ofir said. “We’re not doing small pods — most transport operators desire M3-class standard buses for 30 to 45 passengers because of the total cost of ownership, and they know how to operate them.”

In September and October, Imagry submitted bids for autonomous buses in Austria, Portugal, Germany, Sweden, and Japan.

Software focus could save money

Because Imagry is vehicle-agnostic, it avoids being tied to specific, expensive hardware, said Ofir. Fifteen vendors are making systems on chips (SoCs) that are sufficient for Level 3 autonomy, he said.

“OEMs want the agility to use different sets of hardware in different vehicles. A $30,000 car is different from a $60,000 car, with different hardware stacks and bills of materials, such as camera or compute,” said Ofir. “It’s a crowded market, and the autonomy stack still costs $100,000 per vehicle. Ours is only $3,000 and runs on Ambarella, NVIDIA, TI, Qualcomm, and Intel.”

“With our first commercial proof of concept for Continental in Frankfurt, Germany, we calibrated our car and did some localization,” he added. “Three days after arrival, we simply took it out on the road, and it drove, knowing there’s no right on red.”

With shortages of drivers, particularly in Japan, operators could save $40,000 to $70,000 per bus per year, he said. The Japanese government wants 50 locations across the country to be served with autonomous buses by the end of 2025 and 100 by the end of 2027.

Autonomous buses are also reliable around the clock and don’t get sick or go on strike, he said.

“We’re working on fully autonomous parking, traffic jam assist, and Safe Driver Overwatch to help younger or older drivers obey traffic signs, which could be a game-changer in the insurance industry,” he added. “Our buses can handle roundabouts, narrow streets, and mixed traffic and are location-independent.”

Phases of autonomous bus deployment

Technology hurdles aside, getting autonomous buses recognized under the rules of the road requires patience, said Ofir.

“Together with Mobileye, which later moved to the robotaxi market, Imagry helped draft Israel’s regulatory framework for autonomous driving, which was completed in 2022,” recalled Ofir. “We’re working with lawmakers in France and Germany and will launch pilots in three markets in 2025.”

Testing even Level 3 autonomy can take years, depending on the region. He outlined the phases for autonomous bus rollout:

  1. Work with the electric bus for that market, then activate the system on a public road. “In the U.S., we’ve installed the full software and control stack in a vehicle and are testing FSD [full self-driving],” Ofir said.
  2. Pass NCAP (European New Car Assessment Programme) testing for merging and stops in 99 scenarios. “We’re the only company to date to pass those tests with an autonomous bus,” said Ofir. “Japan also has stringent safety standards.”
  3. Pass the cybersecurity framework, then allow passengers onboard buses with a safety driver present.
  4. Autonomously drive 100,000 km (62,137 mi.) on a designated route with one or more buses. After submitting a report to a department of motor vehicles or the equivalent, the bus operator could then remove the human driver.

“The silicon, sensors, and software don’t matter for time to revenue, and getting approvals from the U.S. National Highway Traffic Safety Administration [NHTSA] can take years,” Ofir said. “We expect passenger vehicles with our software on the road in Europe, the U.S., and Japan sometime in 2027.”

Imagry has joined Partners for Automated Vehicle Education (PAVE) and will be exhibiting at CES in January 2025.

Smart Vision Works introduces SiftAI robotic potato sorter
https://www.therobotreport.com/smart-vision-works-introduces-siftai-robotic-potato-sorter/
Sat, 23 Nov 2024

Smart Vision Works said its SiftAI robotic potato sorter will pay for itself in fewer than two years from installation.


Smart Vision Works’ SiftAI vision system uses AI to sort potato defects and sizes with high accuracy. | Source: Smart Vision Works

Fresh-pack potato processors struggle to find workers for the final inspection of potato sorting and grading. Smart Vision Works last week announced the SiftAI Robotic Sorter, which combines a delta robot with an AI-based vision inspection system to sort potatoes.

Even when potato-sorting sheds are adequately staffed, defects still reach customers, and acceptable potatoes are wasted, it said. The Westborough, Mass.-based company said its robotic sorter can automate final inspection, ensuring accurate grading, increasing profits, and allowing managers to redeploy scarce workers to other tasks.

“Because of potato oversupply and rising wages in North America, many potato processors are losing money on every box shipped,” said Curtis Koelling, vice president of product development and innovation at Smart Vision Works.

“Managers are eager to identify technology that can lower their production costs,” he said. “When they see a competitor managing final inspection without labor costs, they become very interested in the technology.”

SiftAI inspects potatoes for defects

Founded in 2012, Smart Vision Works creates AI and machine learning algorithms to reduce the number of images needed to train models. This allows it to take on challenging machine vision problems and deliver high-quality solutions for its customers.

KPM Analytics, a global developer of scientific instrumentation, acquired the Orem, Utah-based company in 2023.

The new product includes a vision-based system, AI software, and a proven potato-inspection model covering 19 different defect types. Installed over a roller table, SiftAI uses its cameras to collect images of all sides of the potato.

Each system is programmed with AI models for overall potato size and shape or the presence of defects like bruises, cracks, percent green, and other cosmetic features.

Smart Vision Works develops rapid sortation

For any potatoes that grade outside the AI model’s acceptance criteria, SiftAI triggers the robotic arm to pick up and remove the potato from the product stream at rates of 80 to 100 picks per minute with two-robot system configurations.
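
The grade-and-reject logic described above can be pictured as a simple check of each potato's scores against acceptance limits, with anything out of bounds triggering a pick command. The sketch below is a toy illustration; the thresholds, score names, and pick interface are assumptions, not Smart Vision Works' actual software.

```python
# Toy sketch of accept/reject logic for a vision-guided sorter.
# Thresholds and field names are made up for illustration only.

ACCEPTANCE_CRITERIA = {
    "bruise_score": 0.2,   # maximum tolerated defect scores (0..1)
    "crack_score": 0.1,
    "percent_green": 5.0,  # maximum % of green surface area
}

def should_reject(grades: dict) -> bool:
    """Return True if any graded attribute exceeds its acceptance limit."""
    return any(grades.get(k, 0.0) > limit for k, limit in ACCEPTANCE_CRITERIA.items())

def process_stream(potatoes, send_pick_command):
    for potato_id, grades in potatoes:
        if should_reject(grades):
            send_pick_command(potato_id)  # delta robot removes it from the belt

if __name__ == "__main__":
    stream = [
        ("p1", {"bruise_score": 0.05, "crack_score": 0.0, "percent_green": 1.0}),
        ("p2", {"bruise_score": 0.40, "crack_score": 0.0, "percent_green": 0.0}),
    ]
    process_stream(stream, send_pick_command=lambda pid: print(f"pick {pid}"))
```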

The SiftAI Robotic Sorter inspects potatoes with the same dexterity and speed as a human inspector but with much higher accuracy, increasing profitability and reducing customer chargebacks, claimed Smart Vision Works. Currently, the industry goal is to have no more than 5% of defective potatoes reaching customers, which is the limit set by the U.S. Department of Agriculture.

Human inspectors typically discard 10% to 20% of acceptable potatoes, reducing profits. In beta testing, the new AI-enabled robotic sorter dramatically reduced the percentage of missed defects and misgraded potatoes. Combined with the labor savings, this increased profitability makes the system’s financial impact significant, said Smart Vision Works.

The investment pays for itself in fewer than two years, said the company. It asserted that the system’s high accuracy is possible because its technology is not like the basic AI commonly used by other vision inspection systems.

Instead, SiftAI is built on 12 years of development by AI scientists and years of experience in the potato industry. Unlike systems that use optical scanners, the system takes a full digital image and runs it through a neural network, said Smart Vision Works.

The SiftAI Robotic Sorter is available for order now.




Duality AI offers developers EDU license for Falcon digital twins, synthetic data
https://www.therobotreport.com/duality-ai-offers-developers-edu-license-for-falcon-digital-twins-synthetic-data/
Thu, 21 Nov 2024

The EDU program offers subscribers full access to Falcon’s comprehensive feature set, alongside community resources developed by Duality AI.


The Falcon digital twin platform provides high-fidelity, domain-tailored simulation for a variety of use cases. | Source: Duality AI

Duality AI yesterday launched an EDU license and subscription for its Falcon simulation platform. The company said it designed this new program to equip aspiring artificial intelligence developers with the synthetic data skills needed to create advanced AI vision models.

This educational, non-commercial license is intended to expand access to digital twin simulation, said Duality. The San Mateo, Calif.-based company said it will enable students and developers to build cutting-edge AI models and meet the growing demand for AI professionals across industries.

“Digital twin simulation has unlocked a future where anyone can build AI models safely, rapidly, and affordably,” said Mike Taylor, co-founder and chief product officer of Duality AI. “Now is the perfect time to invest in building a community that can harness these tools.”

“Whether learners come from an engineering, research, or creative background, we’re excited to share our expertise and help them discover how their skills can play a vital role in the evolving AI industry,” he stated.

Falcon generates accurate data for modeling, training

Founded in 2018, Duality AI said its multidisciplinary team includes engineers, simulation specialists, AI and machine learning experts, and technical artists. They hold more than 70 patents across robotics, simulation, and visualization.

The company specializes in cases where real-world data is insufficient for achieving the precision required for AI modeling and training of complex operations. Duality said it has developed proven techniques that drive successful outcomes for its customers. 

By bringing high-fidelity digital twins of environments and operating systems into Falcon, organizations can generate accurate data and predictive behavior modeling, said Duality AI. This enables them to deploy automated systems robustly and at scale, the company claimed.

Organizations are using the Falcon platform to help solve problems in AI, robotics, and smart system engineering, said the company. Their applications span off-road autonomous driving, high-volume manufacturing, warehouse automation, and disaster management.

Duality AI told The Robot Report that it is taking a similar approach with the EDU license to its work with NASA’s Jet Propulsion Laboratory on the DARPA RACER program, enabling students to generate synthetic data for outdoor environments and train and test AI models for autonomous off-road vehicles.

Duality AI to extend its expertise to students

As the need for accurate AI vision models continues to grow, so does the need for skills in digital twin simulation and synthetic data generation, said Duality AI.

“There is currently a lack of some key skills — such as creating digital twins or best-practice techniques for getting the most out of synthetic data — that are not that difficult to learn, but make a huge difference,” said a Duality AI spokesman. “We’re helping close that gap.”

The EDU program offers subscribers full access to Falcon’s feature set. It also includes guided exercises and community resources developed by Duality AI’s experts.

“As an example: In Exercise 1 of the program, we are showing roboticists another way to develop the object-detection models that run on their systems,” the spokesman said. “In fact, it’s a method that many in our field don’t think is possible. We want to show them that not only is it possible, but [also] that we can teach them how to bring these skills into their own development patterns.”

To further support all learners, Duality is launching an online community where anyone can ask questions, collaborate on projects, and share their work.

The company said the curriculum itself is designed to build a strong foundation in digital twin and synthetic data workflows, equipping participants with the skills to create high-performance AI vision models independently.

“Falcon is the platform I wish I had as a graduate student,” said Dr. Felipe Mejia, an AI vision engineer at Duality. “I was always searching for datasets to test new algorithms, and working with digital twins in Falcon offers endless opportunities to experiment and explore.”

“It allows me to simulate scenarios not well-covered by real data, and easily investigate model failure modes — like how does object detection success rate change based on obstruction, distance, lighting? Or any other variable,” he noted.
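
The kind of investigation Mejia describes, sweeping one variable at a time and watching how detection success changes, follows a simple evaluation pattern. The sketch below illustrates that pattern with stand-in stubs for the renderer and detector; it does not use Duality AI's Falcon API, and all names and numbers are hypothetical.

```python
# Sketch of a parameter sweep over a simulated scene to study how
# detection success varies with camera distance. The render_scene and
# run_detector functions are toy stand-ins, not Duality AI's Falcon API.

import random

def render_scene(distance_m: float, lighting: str) -> dict:
    """Stand-in for a synthetic render; returns scene metadata only."""
    return {"distance_m": distance_m, "lighting": lighting}

def run_detector(scene: dict) -> bool:
    """Toy detector whose hit rate falls off with distance and dim light."""
    base = 0.95 if scene["lighting"] == "day" else 0.75
    p_hit = max(0.0, base - 0.03 * scene["distance_m"])
    return random.random() < p_hit

def sweep(distances, lighting="day", trials=200):
    results = {}
    for d in distances:
        hits = sum(run_detector(render_scene(d, lighting)) for _ in range(trials))
        results[d] = hits / trials
    return results

if __name__ == "__main__":
    for d, rate in sweep([5, 10, 20, 30]).items():
        print(f"distance {d:>2} m -> detection rate {rate:.2f}")
```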

Duality AI added that its EDU subscription is intended to inspire innovation, and it encouraged users to experiment, develop their projects, and apply their learnings across a variety of fields. The company said it “hopes to foster a vibrant community of innovators eager to explore the full potential of synthetic data and digital twin simulation in modern AI applications.”




Nuro Driver expands Level 4 autonomous fleet in California and Texas
https://www.therobotreport.com/nuro-driver-expands-level-4-autonomous-deliveries-california-texas/
Tue, 19 Nov 2024

With this expanded deployment of zero-occupant vehicles, the company said Nuro Driver is ready to autonomously transport people and goods.


Nuro’s custom L4 vehicles use the Nuro Driver to safely carry food and drink, with no human present in the vehicle. | Source: Nuro

Nuro Inc. today announced a significant expansion of its driverless capabilities using zero-occupant vehicles with the artificial intelligence-powered Nuro Driver system. The company said this expansion covers multiple cities in two states and includes significant operational advancements.

The expanded deployment of autonomous vehicles demonstrates foundational technology for transporting people and goods, asserted Nuro. It plans to expand in Mountain View and Palo Alto, Calif., where the company increased its deployment area by 83%. Nuro also plans to increase its deployment area in Houston by 70%, in terms of linear miles. 

In September, Nuro expanded its business model to include licensing Nuro Driver to automotive OEMs. As part of the new licensing model, the company also announced the Nuro AI Platform, which consists of scalable and performant developer tools to support AI development and validation for the Nuro Driver.

“Since publicly unveiling our new direction a little over a month ago, we have seen tremendous interest in our AI-driven autonomy platform from automotive OEMs and mobility companies,” stated Jiajun Zhu, the co-founder and CEO of Nuro. “Our latest driverless deployment demonstrates the maturity and capability of our AI platform, and we’re excited for potential partners to capitalize on the performance, safety, and sophistication of the Nuro Driver to build their own incredible autonomy products.”




Nuro Driver ready to take on new challenges

Founded in 2016, Nuro said its newly expanded operational design domain (ODD) encompasses advances including:

  • Multi-lane road operation at speeds up to 35 mph (56.3 kph)
  • Improvements related to complex scenario handling, such as reacting to active emergency vehicles, navigating construction zones, and responding to active school buses
  • Night operation, expanding service availability

Nuro said its system now covers a wider portion of everyday driving conditions. The Mountain View-based company said this expanded operational scope demonstrates the growing sophistication and reliability of its autonomous vehicles in real-world applications.

To date, Nuro said its fleet has logged more than 1 million autonomous miles with zero at-fault incidents, underscoring the company’s commitment to safety and technological excellence. Its custom L4 vehicle is designed with cost-effective, automotive-grade components.

Nuro claimed that its approach ensures that its technology is not only highly capable but also practical for large-scale deployment across various vehicle types and use cases. The company said Nuro Driver can accelerate autonomous vehicle development by enabling up to SAE Level 4 autonomy on mobility platforms and personally-owned vehicles.

MIT: LucidSim training system helps robots close Sim2Real gap
https://www.therobotreport.com/mit-lucidsim-training-system-helps-robots-close-sim2real-gap/
Sun, 17 Nov 2024

LucidSim uses generative AI and physics simulators to create realistic virtual training environments that help robots learn tasks without any real-world data.


For roboticists, one challenge towers above all others: generalization – the ability to create machines that can adapt to any environment or condition. Since the 1970s, the field has evolved from writing sophisticated programs to using deep learning, teaching robots to learn directly from human behavior. But a critical bottleneck remains: data quality.

To improve, robots need to encounter scenarios that push the boundaries of their capabilities, operating at the edge of their mastery. This process traditionally requires human oversight, with operators carefully challenging robots to expand their abilities. As robots become more sophisticated, this hands-on approach hits a scaling problem: the demand for high-quality training data far outpaces humans’ ability to provide it.

A team of MIT CSAIL researchers has developed an approach to robot training that could significantly accelerate the deployment of adaptable, intelligent machines in real-world environments. The new system, called “LucidSim,” uses recent advances in generative AI and physics simulators to create diverse and realistic virtual training environments, helping robots achieve expert-level performance in difficult tasks without any real-world data.

LucidSim combines physics simulation with generative AI models, addressing one of the most persistent challenges in robotics: transferring skills learned in simulation to the real world.

“A fundamental challenge in robot learning has long been the ‘sim-to-real gap’ – the disparity between simulated training environments and the complex, unpredictable real world,” said MIT CSAIL postdoctoral associate Ge Yang, a lead researcher on LucidSim. “Previous approaches often relied on depth sensors, which simplified the problem but missed crucial real-world complexities.”

The multi-pronged system is a blend of different technologies. At its core, LucidSim uses large language models to generate various structured descriptions of environments. These descriptions are then transformed into images using generative models. To ensure that these images reflect real-world physics, an underlying physics simulator is used to guide the generation process.
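
Based on that description, the pipeline chains three stages: language-model prompts, simulator geometry, and geometry-conditioned image generation. The sketch below outlines that flow with stub functions standing in for the real LLM, simulator, and generative model; it is illustrative only and is not the MIT CSAIL code.

```python
# Toy outline of the LucidSim-style pipeline described above:
# (1) a language model proposes varied scene descriptions, (2) a physics
# simulator exports geometry (depth and semantic masks), and (3) a
# generative image model turns each description into pixels, conditioned
# on that geometry. All functions here are stand-in stubs.

import random

def propose_scene_descriptions(n: int) -> list[str]:
    """Stand-in for LLM prompting; a real system would query a language model."""
    themes = ["mossy alley at dusk", "sunlit warehouse aisle", "icy rooftop at night"]
    return [random.choice(themes) for _ in range(n)]

def simulate_geometry(scene_id: int) -> dict:
    """Stand-in for a physics simulator export: depth map + semantic mask."""
    return {"depth": f"depth_{scene_id}.png", "mask": f"mask_{scene_id}.png"}

def generate_image(description: str, geometry: dict) -> str:
    """Stand-in for conditional image generation guided by the geometry."""
    return f"image({description}, {geometry['depth']}, {geometry['mask']})"

def build_training_batch(n_scenes: int) -> list[str]:
    prompts = propose_scene_descriptions(n_scenes)
    return [generate_image(p, simulate_geometry(i)) for i, p in enumerate(prompts)]

if __name__ == "__main__":
    for sample in build_training_batch(3):
        print(sample)
```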

Related: How Agility Robotics closed the Sim2Real gap for Digit

Birth of an idea: from burritos to breakthroughs

The inspiration for LucidSim came from an unexpected place: a conversation outside Beantown Taqueria in Cambridge, MA.

“We wanted to teach vision-equipped robots how to improve using human feedback. But then, we realized we didn’t have a pure vision-based policy to begin with,” said Alan Yu, an undergraduate student at MIT and co-lead on LucidSim. “We kept talking about it as we walked down the street, and then we stopped outside the taqueria for about half an hour. That’s where we had our moment.”




To cook up their data, the team generated realistic images by extracting depth maps, which provide geometric information, and semantic masks, which label different parts of an image, from the simulated scene. They quickly realized, however, that with tight control over the composition of the image content, the model would produce nearly identical images from the same prompt. So, they devised a way to source diverse text prompts from ChatGPT.

This approach, however, only resulted in a single image. To make short, coherent videos that serve as little “experiences” for the robot, the scientists developed another novel technique, called “Dreams In Motion” (DIM). The system computes the movement of each pixel between frames to warp a single generated image into a short, multi-frame video, taking into account the 3D geometry of the scene and the relative changes in the robot’s perspective.
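
The per-pixel motion it computes is essentially depth-based reprojection: each pixel is lifted into 3D using its depth, moved according to the camera's change in pose, and projected back into the new view. The NumPy sketch below shows that generic warping step; it illustrates the underlying geometry, not the exact Dreams In Motion implementation.

```python
# Minimal depth-based image warping, the geometric idea behind turning a
# single generated frame into nearby views. This is a generic sketch of
# per-pixel reprojection, not the Dreams In Motion code.

import numpy as np

def warp_image(image, depth, K, R, t):
    """Forward-warp `image` (H, W, 3) into a new camera pose (R, t).

    depth: (H, W) depth in meters for each source pixel
    K:     (3, 3) camera intrinsics
    R, t:  rotation (3, 3) and translation (3,) from source to target camera
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # (3, H*W)

    # Lift pixels to 3D points in the source camera frame.
    pts = np.linalg.inv(K) @ pix * depth.reshape(1, -1)

    # Move points into the target camera frame and project them back.
    pts_t = R @ pts + t.reshape(3, 1)
    proj = K @ pts_t
    u2 = np.round(proj[0] / proj[2]).astype(int)
    v2 = np.round(proj[1] / proj[2]).astype(int)

    out = np.zeros_like(image)
    ok = (proj[2] > 0) & (u2 >= 0) & (u2 < W) & (v2 >= 0) & (v2 < H)
    out[v2[ok], u2[ok]] = image.reshape(-1, 3)[ok]  # nearest-neighbor splat
    return out

if __name__ == "__main__":
    img = np.random.randint(0, 255, (120, 160, 3), dtype=np.uint8)
    depth = np.full((120, 160), 5.0)                 # flat scene 5 m away
    K = np.array([[100.0, 0, 80], [0, 100.0, 60], [0, 0, 1]])
    shifted = warp_image(img, depth, K, np.eye(3), np.array([0.1, 0.0, 0.0]))
    print(shifted.shape)
```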

“We outperform domain randomization, a method developed in 2017 that applies random colors and patterns to objects in the environment, which is still considered the go-to method these days,” says Yu. “While this technique generates diverse data, it lacks realism. LucidSim addresses both diversity and realism problems. It’s exciting that even without seeing the real world during training, the robot can recognize and navigate obstacles in real environments.”

The team is particularly excited about the potential of applying LucidSim to domains outside quadruped locomotion and parkour, their main testbed. One example is mobile manipulation, where a mobile robot is tasked with handling objects in an open area and where color perception is critical.

“Today, these robots still learn from real-world demonstrations,” said Yang. “Although collecting demonstrations is easy, scaling a real-world robot teleoperation setup to thousands of skills is challenging because a human has to physically set up each scene. We hope to make this easier, thus qualitatively more scalable, by moving data collection into a virtual environment.”


MIT researchers used a Unitree Robotics Go1 quadruped. | Credit: MIT CSAIL

The team put LucidSim to the test against an alternative, where an expert teacher demonstrates the skill for the robot to learn from. The results were surprising: robots trained by the expert struggled, succeeding only 15 percent of the time – and even quadrupling the amount of expert training data barely moved the needle. But when robots collected their own training data through LucidSim, the story changed dramatically. Just doubling the dataset size catapulted success rates to 88%.

“And giving our robot more data monotonically improves its performance – eventually, the student becomes the expert,” said Yang.

“One of the main challenges in sim-to-real transfer for robotics is achieving visual realism in simulated environments,” said Stanford University assistant professor of Electrical Engineering Shuran Song, who wasn’t involved in the research. “The LucidSim framework provides an elegant solution by using generative models to create diverse, highly realistic visual data for any simulation. This work could significantly accelerate the deployment of robots trained in virtual environments to real-world tasks.”

From the streets of Cambridge to the cutting edge of robotics research, LucidSim is paving the way toward a new generation of intelligent, adaptable machines – ones that learn to navigate our complex world without ever setting foot in it.

Yu and Yang wrote the paper with four fellow CSAIL affiliates: mechanical engineering postdoc Ran Choi; undergraduate researcher Yajvan Ravan; John Leonard, Samuel C. Collins Professor of Mechanical and Ocean Engineering in the MIT Department of Mechanical Engineering; and MIT Associate Professor Phillip Isola.

Editor’s Note: This article was republished from MIT CSAIL

The AI Institute introduces Theia vision foundation model to improve robot learning
https://www.therobotreport.com/theia-vision-foundation-model-aiinstitute-generates-improve-robot-learning/
Wed, 13 Nov 2024

Theia is a visual foundation model that the AI Institute says can distill diverse models for policy learning at a lower computation cost.

 

In the field of robotics, vision-based learning systems are a promising strategy for enabling machines to interpret and interact with their environment, said the AI Institute today. It introduced the Theia vision foundation model to facilitate robot training.

Vision-based learning systems must provide robust representations of the world, allowing robots to understand and respond to their surroundings, said the AI Institute. Traditional approaches typically focus on single-task models—such as classification, segmentation, or object detection—which individually do not encapsulate the diverse understanding of a scene required for robot learning.

This shortcoming highlights the need for a more holistic solution capable of interpreting a broad spectrum of visual cues efficiently, said the Cambridge, Mass.-based institute, which is developing Theia to address this gap.

In a paper published at the Conference on Robot Learning (CoRL), the AI Institute introduced Theia, a model designed to distill the expertise of multiple off-the-shelf vision foundation models (VFMs) into a single model. By combining the strengths of multiple different VFMs, each trained for a specific visual task, Theia generates a richer, unified visual representation that can be used to improve robot learning performance.

Robot policies trained using Theia’s encoder achieved a higher average task success rate of 80.97% when evaluated against 12 robot simulation tasks, a statistically significant improvement over other representation choices.

Furthermore, in real robot experiments, where the institute used behavior cloning to learn robot policies across four multi-step tasks, the trained policy success rate using Theia was on average 15 percentage points higher than policies trained using the next-best representation.


Robot control policies trained with Theia outperform policies trained with alternative representations on MuJoCo robot simulation tasks, with much less computation, measured by the number of Multiply-Accumulate operations in billions (MACs). Source: The AI Institute

Theia designed to combine visual models

Theia’s design is based on a distillation process that integrates the strengths of multiple VFMs such as CLIP (vision language), DINOv2 (dense visual correspondence), and ViT (classification), among others. By carefully selecting and combining these models, Theia is able to produce robust visual representations that can improve downstream robot learning performance, said the AI Institute.

At its core, Theia consists of a visual encoder (backbone) and a set of feature translators, which work in tandem to incorporate the knowledge from multiple VFMs into a unified model. The visual encoder generates latent representations that capture diverse visual insights.

These representations are then processed by the feature translators, which refine them by comparing the output features against ground truth. This comparison serves as a supervisory signal, optimizing Theia’s latent representations to enhance their diversity and accuracy.

These optimized latent representations are subsequently used to fine-tune policy learning models, enabling robots to perform a wide range of tasks with greater accuracy.
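
In code, the training signal described above amounts to a shared encoder whose latent features are pushed, through small per-teacher translator heads, toward the features each off-the-shelf VFM produces for the same image. The condensed PyTorch sketch below is illustrative only: the network sizes are arbitrary, and random tensors stand in for the real CLIP, DINOv2, and ViT features.

```python
# Condensed sketch of feature distillation in the spirit of Theia:
# one shared encoder, one translator head per teacher VFM, and a loss
# that matches each translated feature to that teacher's output.
# Dimensions and the random "teacher features" are illustrative only.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DistilledEncoder(nn.Module):
    def __init__(self, latent_dim=256, teacher_dims=None):
        super().__init__()
        teacher_dims = teacher_dims or {"clip": 512, "dinov2": 768, "vit": 384}
        self.backbone = nn.Sequential(              # stand-in visual encoder
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, latent_dim),
        )
        # One feature translator per teacher model.
        self.translators = nn.ModuleDict(
            {name: nn.Linear(latent_dim, dim) for name, dim in teacher_dims.items()}
        )

    def forward(self, images):
        latent = self.backbone(images)
        return latent, {name: head(latent) for name, head in self.translators.items()}

def distillation_loss(predicted, teacher_features):
    """Sum of per-teacher regression losses between translated and teacher features."""
    return sum(F.mse_loss(predicted[n], teacher_features[n]) for n in predicted)

if __name__ == "__main__":
    model = DistilledEncoder()
    images = torch.randn(4, 3, 128, 128)
    # In real training these would come from frozen CLIP / DINOv2 / ViT encoders.
    teachers = {"clip": torch.randn(4, 512), "dinov2": torch.randn(4, 768), "vit": torch.randn(4, 384)}
    latent, preds = model(images)
    loss = distillation_loss(preds, teachers)
    loss.backward()
    print(latent.shape, float(loss))
```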


Theia’s design is based on a process that distills the strengths of multiple VFMs, including CLIP, SAM, DINOv2, Depth-Anything, and ViT, among others. Source: The AI Institute

Robots learn in the lab

Researchers at the AI Institute tested Theia in simulation and on a number of robot platforms, including Boston Dynamics‘ Spot and a WidowX robot arm. For one of the rounds of lab testing, it used Theia to train a policy enabling a robot to open a small microwave, place toy food inside, and close the microwave door.

Previously, researchers would have needed to combine all the VFMs, which is slow and computationally expensive, or select which VFM to use to represent the scene in front of the robot. For example, they could choose a segmentation image from a segmentation model, a depth image from a depth model, or a text class name from an image classification model. Each provided different types and granularity of information about the scene.

Generally, a single VFM might work well for a single task with known objects but might not be the right choice for other tasks or other robots.

With Theia, the same image from the robot can be fed through the encoder to generate a single representation with all the key information. That representation can then be input into Theia’s segmentation decoder to output a segmentation image. The same representation can be input into Theia’s depth decoder to output a depth image, and so on.

Each decoder uses the same representation as input because the shared representation possesses the information required to generate all the outputs from the original VFMs. This streamlines the training process and makes actions transferable to a broader range of situations, said the researchers.
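
At inference, that shared representation is computed once and reused by whichever decoder a task needs. The toy snippet below illustrates the pattern with stand-in linear heads; the shapes and modules are arbitrary placeholders, not Theia's architecture.

```python
# Toy illustration of one encoder pass producing a shared representation
# that separate decoder heads reuse for different outputs (for example,
# segmentation-like and depth-like predictions). Shapes are arbitrary.

import torch
import torch.nn as nn

latent_dim, h, w, num_classes = 256, 16, 16, 8
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, latent_dim))  # stand-in encoder
segmentation_decoder = nn.Linear(latent_dim, num_classes * h * w)
depth_decoder = nn.Linear(latent_dim, h * w)

image = torch.randn(1, 3, 64, 64)
shared = encoder(image)                                    # computed once
seg = segmentation_decoder(shared).view(1, num_classes, h, w)
depth = depth_decoder(shared).view(1, h, w)
print(shared.shape, seg.shape, depth.shape)
```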

While it sounds easy for a person, the microwaving task represents a more complex behavior because it requires successful completion of multiple steps: picking up the object, placing it into the microwave, and closing the microwave door. The policy trained with Theia is among the top performers for each of these steps, comparable only to E-RADIO, another approach which also combines multiple VFMs, although not specifically for robotics applications.


Researchers used Theia to train a policy enabling a robot arm to microwave various types of toy food. Source: The AI Institute

Theia prioritizes efficiency

One of Theia’s main advantages over other VFMs is its efficiency, said the AI Institute. Training Theia requires about 150 GPU hours on datasets like ImageNet, reducing the computational resources needed compared to other models.

This high efficiency does not come at the expense of performance, making Theia a practical choice for both research and application. With a smaller model size and reduced need for training data, Theia conserves computational resources during both the training and fine-tuning processes.

AI Institute sees transformation in robot learning

Theia enables robots to learn and adapt more quickly and effectively by refining knowledge from multiple vision models into compact representations for classification, segmentation, depth prediction, and other modalities.

While there is still much work to be done before reaching a 100% success rate on complex robotics tasks using Theia or other VFMs, Theia makes progress toward this goal while using less training data and fewer computational resources.

The AI Institute invited researchers and developers to explore Theia and further evaluate its capabilities to improve how robots learn and interpret their environments.

“We’re excited to see how Theia can contribute to both academic research and practical applications in robotics,” it said. Visit the AI Institute’s project page and demo page to learn more about Theia.



